Process for immobilizing enzymes to the cell wall of a microbial cell by producing a fusion protein

The present invention is in the field of conversion processes using immobilized
enzymes, produced by genetic engineering.

Background of the invention 

In the detergent, personal care and food products industries there is a strong trend
towards natural ingredients and towards environmentally acceptable production
processes. Enzymic conversions are very important for meeting these consumer
demands, as these processes can be completely natural. Moreover, enzymic processes
are very specific and consequently produce minimal amounts of waste products. Such
processes can be carried out in water at mild temperatures and atmospheric pressure.
However, enzymic processes based on free enzymes are either quite expensive due to
the loss of enzymes or require expensive equipment, such as ultra-membrane systems,
to entrap the enzyme.

Alternatively, enzymes can be immobilized either physically or chemically. The latter
method often has the disadvantage that coupling is carried out using non-natural
chemicals and in processes that are not attractive from an environmental point of
view. Moreover, chemical modification of enzymes is rarely specific, which means
that coupling can affect the activity of the enzyme negatively.
Physical immobilization can comply with consumer demands, but it too may affect the
activity of the enzyme negatively. Moreover, a physically immobilized enzyme is in
equilibrium with free enzyme, which means that in continuous reactors, according to
the laws of thermodynamics, substantial losses of enzyme are unavoidable.

There are a few publications on immobilization of enzymes to microbial cells (see
reference 1). The present invention provides a method for immobilizing enzymes to
cell walls of microbial cells in a very precise way. Additionally, the immobilization
does not require any chemical or physical coupling step and is very efficient.

Some extracellular proteins are known to have special functions which they can
perform only if they remain bound to the cell wall of the host cell. Often this type of
protein has a long C-terminal part that anchors it in the cell wall. These C-terminal
parts have very special amino acid sequences. A typical example is anchoring via
C-terminal sequences enriched in proline (see reference 2). Another mechanism to
anchor proteins in cell walls is that the protein has a glycosyl-phosphatidyl-inositol
(GPI) anchor (see reference 3) and that the C-terminal part of the protein contains a
substantial number of potential serine and threonine glycosylation sites.
O-Glycosylation of these sites gives a rod-like conformation to the C-terminal part of
these proteins. Another feature of these manno-proteins is that they seem to be
linked to the glucan in the cell wall of lower eukaryotes, as they cannot be extracted
from the cell wall with SDS, but can be liberated by glucanase treatment.

Summary of the invention 

The present invention provides a method for immobilizing an enzyme, which
comprises the use of recombinant DNA techniques for producing an enzyme or a
functional part thereof linked to the cell wall of a host cell, preferably a microbial
cell, and whereby the enzyme or functional fragment thereof is localized at the
exterior of the cell wall. Preferably the enzyme or the functional part thereof is
immobilized by linking to the C-terminal part of a protein that ensures anchoring in
the cell wall.

In one embodiment of the invention a recombinant polynucleotide is provided
comprising a structural gene encoding a protein providing catalytic activity and at
least a part of a gene encoding a protein capable of anchoring in a eukaryotic or
prokaryotic cell wall, said part encoding at least the C-terminal part of said anchoring
protein. Preferably the polynucleotide further comprises a sequence encoding a signal
peptide ensuring secretion of the expression product of the polynucleotide. Such
signal peptide can be derived from a glycosyl-phosphatidyl-inositol (GPI) anchoring
protein, α-factor, α-agglutinin, invertase or inulinase, α-amylase of Bacillus, or a
proteinase of lactic acid bacteria. The DNA sequence encoding a protein capable of
anchoring in the cell wall can encode α-agglutinin, AGA1, FLO1 or the Major Cell
Wall Protein of lower eukaryotes, or a proteinase of lactic acid bacteria. The
recombinant polynucleotide is operably linked to a promoter, preferably an inducible
promoter. The DNA sequence encoding a protein providing catalytic activity can
encode a hydrolytic enzyme, e.g. a lipase, or an oxidoreductase, e.g. an oxidase.

Another embodiment of the invention relates to a recombinant vector comprising a
polynucleotide as described above. If such vector contains a DNA sequence encoding
a protein providing catalytic activity, which protein exhibits said catalytic activity when
present in a multimeric form, said vector can further comprise a second
polynucleotide comprising a structural gene encoding the same protein providing
catalytic activity combined with a sequence encoding a signal peptide ensuring
secretion of the expression product of said second polynucleotide, said second
polynucleotide being operably linked to a regulatable promoter, preferably an
inducible or repressible promoter.

A further embodiment of the invention relates to a chimeric protein encoded by a 
polynucleotide as described above. 

Still another embodiment is a host cell, preferably a microorganism, containing a
polynucleotide as described above or a vector as described above. If the protein
providing catalytic activity exhibits said catalytic activity when present in a multimeric
form, said host cell or microorganism can further comprise a second polynucleotide
comprising a structural gene encoding the same protein providing catalytic activity
combined with a sequence encoding a signal peptide ensuring secretion of the
expression product of said second polynucleotide, said second polynucleotide being
operably linked to a regulatable promoter, preferably an inducible or repressible
promoter, and said second polynucleotide being present either in another vector or in
the chromosome of said microorganism. Preferably the host cell or microorganism has
at least one of said polynucleotides integrated in its chromosome. As a result of
culturing such host cell or microorganism the invention provides a host cell,
preferably a microorganism, having a protein as described above immobilized on its
cell wall. The host cell or microorganism can be a lower eukaryote, in particular a
yeast.

The invention also provides a process for carrying out an enzymatic process by using
an immobilized catalytically active protein, wherein a substrate for said catalytically
active protein is contacted with a host cell or microorganism according to the
invention.
Brief Description of the Figures

Figure 1: DNA sequence of the 6057 bp HindIII fragment containing the complete
AGα1 gene of S. cerevisiae (see SEQ ID NO: 1 and 2). The position of the unique
NheI site and the HindIII site used for the described constructions is specified in the
header.

Figure 2: Schematic presentation of the construction of pUR2969. The restriction sites
for endonucleases used are shown. Abbreviations used:
AG-alpha-1: Gene expressing α-agglutinin from S. cerevisiae
amp: β-lactamase resistance gene
PGKp: phosphoglyceratekinase promoter
PGKt: terminator of the same gene.

Figure 3: α-Galactosidase activity of S. cerevisiae MT302/1C cells and culture fluid
transformed with pSY13 during batch culture:
A: U/l α-galactosidase per time; the OD530 is also shown
B: α-galactosidase activity of free and bound enzyme expressed in U/OD530.

Figure 4: α-Galactosidase activity of S. cerevisiae MT302/1C cells and culture fluid
transformed with pUR2969 during batch culture:
A: U/l α-galactosidase per time; the OD530 is also shown
B: α-galactosidase activity of free and bound enzyme expressed in U/OD530.

Figure 5: Western analysis with anti-α-galactosidase serum of extracellular fractions of
cells of exponential phase (OD530 = 2). The analyzed fractions are equivalent to 4 mg
cell walls (fresh weight):
A: MT302/1C expressing α-galactosidase,
lane 1, growth medium
lane 2, SDS extract of isolated cell walls
lane 3, glucanase extract of SDS-extracted cell walls;
B: MT302/1C expressing α-Gal-AGα1 fusion protein,
lane 1, growth medium
lane 2, SDS extract of isolated cell walls
lane 3, glucanase extract of SDS-extracted cell walls
lane 4, Endo-H treated glucanase extract.

Figure 6: Immunofluorescent labelling (anti-α-galactosidase) of MT302/1C cells in
the exponential phase (OD530 = 2) expressing the α-Gal-α-agglutinin fusion protein.
Phase micrograph of intact cells. A: overview B: detail.

Figure 7: Schematic presentation of the construction of pUR2970A, pUR2971A,
pUR2972A, and pUR2973. The restriction sites for endonucleases used are indicated
in the figure. PCR oligonucleotide sequences are mentioned in the text.
Abbreviations used:
AGα1 cds: coding sequence of α-agglutinin
α-AGG = AGα1: Gene expressing α-agglutinin from S. cerevisiae
amp: β-lactamase resistance gene
Pgal7 = GAL7: GAL7 promoter
lipolase: lipase gene of Humicola
invSS: SUC2 signal sequence
α-MF: prepro-α-mating factor sequence
α-gal: α-galactosidase gene
LEU2d: truncated promoter of LEU2 gene
LEU2: LEU2 gene with complete promoter sequence.

Figure 8: DNA sequence of a fragment containing the complete coding sequence of
lipase B of Geotrichum candidum strain 335426 (see SEQ ID NO: 11 and 12). The
sequence of the mature lipase B starts at nucleotide 97 of the given sequence. The
coding sequence starts at nucleotide 40 (ATG).

Figure 9: Schematic presentation of the construction of pUR2975 and pUR2976. The
restriction sites for endonucleases used are shown. Abbreviations used:
α-AGG: Gene expressing α-agglutinin from S. cerevisiae
amp: β-lactamase resistance gene
Pgal7 = GAL7: GAL7 promoter
invSS: SUC2 signal sequence
α-MF: prepro-α-mating factor sequence
LEU2d: truncated promoter LEU2 gene
lipolase: lipase gene of Humicola
lipaseB: lipaseB gene of Geotrichum candidum.

Figure 10: Schematic presentation of the construction of pUR2981 and pUR2982. The
restriction sites for endonucleases used are shown. Abbreviations used:
α-AGG = AG-alpha 1: Gene expressing α-agglutinin from S. cerevisiae
mucor lipase: lipase gene of Rhizomucor miehei
2µ: 2 µm sequence
Pgal7 = GAL7: GAL7 promoter
invSS: SUC2 signal sequence
α-MF: prepro-α-mating factor sequence
lipolase: lipase gene of Humicola
amp: β-lactamase resistance gene
LEU2d: truncated promoter LEU2 gene
LEU2: LEU2 gene with complete promoter sequence.

Figure 11: DNA sequence (2685 bases) of the 894 amino acids coding part of the
FLO1 gene (see SEQ ID NO: 21 and 22); the given sequence starts with the codon
for the first amino acid and ends with the stop codon.

Figure 12: Schematic presentation of plasmid pUR2990. Some restriction sites for
endonucleases relevant for the given cloning procedure are shown.

Figure 13: Schematic presentation of plasmid pUR7034.

Figure 14: Schematic presentation of plasmid pUR2972B.

Figure 15: Immunofluorescent labelling (anti-lipolase) of SU10 cells in the
exponential phase (OD530 = 0.3) expressing the lipolase/α-agglutinin fusion protein.
A: phase micrograph B: matching fluorescent micrograph

Detailed description of the invention 

The present invention provides a method for immobilizing an enzyme, comprising
immobilizing the enzyme or a functional part thereof to the cell wall of a host cell,
preferably a microbial cell, using recombinant DNA techniques. In particular, the
C-terminal part of a protein that ensures anchoring in the cell wall is linked to an
enzyme or the functional part of an enzyme, in such a way that the enzyme is
localized on or just above the cell surface. In this way immobilized enzymes are
obtained on the surface of cells. The linkage is performed at gene level and is
characterized in that the structural gene coding for the enzyme is coupled to at least
part of a gene encoding an anchor-protein in such a way that in the expression
product the enzyme is coupled at its C-terminal end to the C-terminal part of an
anchor-protein. The chimeric enzyme is preferably preceded by a signal sequence that
ensures efficient secretion of the chimeric protein.
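
The coupling at gene level described above can be illustrated with a minimal sketch. It joins a signal sequence, an enzyme gene and the C-terminal part of an anchor gene into one open reading frame and checks that the fusion stays in frame. The sequences and the function names are hypothetical placeholders chosen for illustration; they are not sequences or tools from this specification.

```python
# Toy sketch: join signal sequence, enzyme gene and anchor fragment into one
# open reading frame and check that the fusion stays in frame.
# All sequences below are short made-up placeholders, not real gene sequences.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(dna):
    """Split a DNA string into consecutive triplets (length must be a multiple of 3)."""
    if len(dna) % 3 != 0:
        raise ValueError(f"length {len(dna)} is not a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def build_fusion_orf(signal, enzyme, anchor_c_term):
    """Return signal + enzyme + anchor as one ORF, or raise ValueError.

    The enzyme part must be given without its own stop codon, so that
    translation runs on into the C-terminal anchor part.
    """
    parts = {"signal": signal, "enzyme": enzyme, "anchor": anchor_c_term}
    for name, seq in parts.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} part would shift the reading frame")
    codons = split_codons(signal + enzyme + anchor_c_term)
    if codons[0] != "ATG":
        raise ValueError("fusion must start with an ATG start codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon inside the fusion")
    if codons[-1] not in STOP_CODONS:
        raise ValueError("fusion must end with a stop codon")
    return "".join(codons)


# Hypothetical toy parts (9 nucleotides each).
SIGNAL = "ATGCTTTTG"   # start codon plus two codons
ENZYME = "GCTGAAACT"   # three codons, no stop codon
ANCHOR = "AGCGCTTAA"   # two codons plus stop codon

fusion = build_fusion_orf(SIGNAL, ENZYME, ANCHOR)
print(len(fusion) // 3, "codons:", fusion)
```

The check rejects an enzyme part that still carries its stop codon, since translation would then end before the anchor part and the enzyme would be released instead of immobilized.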

Thus the invention relates to a recombinant polynucleotide comprising a structural
gene encoding a protein providing catalytic activity and at least a part of a gene
encoding a protein capable of anchoring in a eukaryotic or prokaryotic cell wall, said
part encoding at least the C-terminal part of said anchoring protein. The length of the
C-terminal part of the anchoring protein may vary. Although the entire structural
protein could be used, it is preferred that only a part is used, leading to a more
efficient exposure of the enzyme protein in the medium surrounding the cell. The
anchoring part of the anchoring protein should preferably be entirely present. As an
example, about the C-terminal half of the anchoring protein could be used.

Preferably, the polynucleotide further comprises a sequence encoding a signal peptide
ensuring secretion of the expression product of the polynucleotide. The signal peptide
can be derived from a GPI anchoring protein, α-factor, α-agglutinin, invertase or
inulinase, α-amylase of Bacillus, or a proteinase of lactic acid bacteria.

The protein capable of anchoring in the cell wall is preferably selected from the
group of α-agglutinin, AGA1, FLO1 (flocculation protein) or the Major Cell Wall
Protein of lower eukaryotes, or a proteinase of lactic acid bacteria. The
polynucleotide of the invention is preferably operably linked to a promoter, preferably
a regulatable promoter, especially an inducible promoter.

The invention also relates to a recombinant vector containing the polynucleotide as
described above, and to a host cell containing this polynucleotide, or this vector.

In a particular case the protein providing catalytic activity exhibits said catalytic
activity only when present in a multimeric form; as may be the case with
oxidoreductases, dimerisation or multimerisation of the monomers might be a
prerequisite for activity. The vector and/or the host cell can then further comprise a
second polynucleotide comprising a structural gene encoding the same protein
providing catalytic activity combined with a sequence encoding a signal peptide ensuring
secretion of the expression product of said second polynucleotide, said second
polynucleotide being operably linked to a regulatable promoter, preferably an
inducible or repressible promoter. Expression and secretion of the second
polynucleotide after expression and secretion of the first polynucleotide will then
result in the formation of an active multimer on the exterior of the cell wall.

The host cell or microorganism preferably contains the polynucleotide described
above, or at least one of said polynucleotides in the case of a combination, integrated
in its chromosome.

The present invention relates in particular to lower eukaryotes like yeasts that have
very stable cell walls and have proteins that are known to be anchored in the cell
wall, e.g. α-agglutinin or the product of gene FLO1. Suitable yeasts belong to the
genera Candida, Debaryomyces, Hansenula, Kluyveromyces, Pichia and Saccharomyces.
Also fungi, especially Aspergillus, Penicillium and Rhizopus, can be used. For certain
applications prokaryotes are also applicable.

For yeasts the present invention deals in particular with genes encoding chimeric 
enzymes consisting of: 

a. the signal sequence, e.g. derived from the α-factor, the invertase, the
α-agglutinin or the inulinase genes;

b. structural genes encoding hydrolytic enzymes such as α-galactosidase, proteases,
peptidases, pectinases, pectylesterase, rhamnogalacturonase, esterases and lipases,
or non-hydrolytic enzymes such as oxidases; and

c. the C-terminus of typically cell wall bound proteins such as α-agglutinin (see
reference 4), AGA1 (see reference 5) and FLO1 (see the non-prior published
reference 6).

The expression of these genes can be under the control of a constitutive promoter,
but more preferred are regulatable, i.e. repressible or inducible, promoters such as the
GAL7 promoter for Saccharomyces, the inulinase promoter for Kluyveromyces or
the methanol-oxidase promoter for Hansenula.

Preferably the constructs are made in such a way that the new genetic information is
integrated in a stable way in the chromosome of the host cell.

The invention further relates to a host cell, in particular a microorganism, having the
chimeric protein described above immobilized on its cell wall. It further concerns the
use of such microorganisms for carrying out an enzymatic process by contacting a
substrate for the enzyme with the microorganism. Such a process may be carried out
e.g. in a packed column, wherein the microorganisms may be supported on solid
particles, or in a stirred reactor. The reaction may be aqueous or non-aqueous. Where
necessary, additives necessary for the performance of the enzyme, e.g. a co-factor,
may be introduced in the reaction medium.

After repeated usage of the naturally immobilized enzyme system in processes, the
performance of the system may decrease. This is caused by physical denaturation,
by chemical poisoning or by detachment of the enzyme. A particular
feature of the present invention is that after usage the system can be recovered from
the reaction medium by simple centrifugation or membrane filtration techniques and
that the thus collected cells can be transferred to a recovery medium in which the
cells revive quickly and concomitantly produce the chimeric protein, thus ensuring
that the surface of the cells will be covered by fully active immobilized enzyme. This
regeneration process is simple and cheap and therefore will improve the economics of
enzymic processes and may result in a much wider application of processes based on
immobilized enzyme systems.

However, the present invention is by no means restricted to the reusability of the
immobilized enzymes.

The invention will be illustrated by the following examples without the scope of the 
invention being limited thereto. 


EXAMPLE 1 Immobilized α-galactosidase/α-agglutinin on the surface of S. cerevisiae.

The gene encoding α-agglutinin has been described by Lipke et al. (see reference 4).
The sequence of a 6057 bp HindIII insert in pTZ18R, containing the whole AGα1
gene, is given in Figure 1. The coding sequence extends over 650 amino acids,
including a putative signal sequence starting at nucleotide 3653 with ATG. The
unique NheI site cuts the DNA at position 988 of the given coding sequence within
the coding part of amino acid 330, thereby separating the α-agglutinin into an
N-terminal and a C-terminal part of about the same size.
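
The position of the NheI site relative to the codons can be checked with a short calculation. The sketch below assumes that position 988 is counted from the first nucleotide of the ATG start codon (position 1) and that the codon containing the site is counted with the C-terminal part; the function name is illustrative only.

```python
def locate_in_codons(nt_position):
    """Map a 1-based nucleotide position in a coding sequence to (codon, offset).

    Both the codon number and the offset within the codon are 1-based.
    """
    codon = (nt_position - 1) // 3 + 1
    offset = (nt_position - 1) % 3 + 1
    return codon, offset


TOTAL_CODONS = 650     # length of the coding sequence as stated in the text
NHEI_POSITION = 988    # position of the NheI site as stated in the text

codon, offset = locate_in_codons(NHEI_POSITION)
n_terminal = codon - 1                    # complete codons before the site
c_terminal = TOTAL_CODONS - n_terminal    # codons from the cut codon onwards
print(codon, offset, n_terminal, c_terminal)  # prints: 330 1 329 321
```

Under these assumptions position 988 is the first nucleotide of codon 330, which leaves 329 codons on the N-terminal side and 321 on the C-terminal side, in line with the statement that the two parts are of about the same size.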

Through digestion of pUR2968 (see Figure 2) with NheI/HindIII a 1.4 kb fragment
was released, containing the sequence information of the putative cell wall anchor.
For the fusion to α-galactosidase the plasmid pSY13 was used, an episomal vector
based on YEplac181, harbouring the α-galactosidase sequence preceded by the SUC2
invertase signal sequence and placed between the constitutive PGK promoter and
PGK terminator. The StyI site, present in the last nine base-pairs of the open reading
frame of the α-galactosidase gene, was ligated to the NheI site of the AGα1 gene
fragment. To ensure the in-frame fusion, the StyI site was filled in and the 5'
overhang of the NheI site was removed, prior to ligation into the StyI/HindIII
digested pSY13 (see Figure 2).

To verify the correct assembly of the new plasmid, the shuttle vector was transformed
into E. coli JM109 (recA1 supE44 endA1 hsdR17 gyrA96 relA1 thi Δ(lac-proAB) F'
[traD36 proAB+ lacIq lacZΔM15]) (see reference 7) by the transformation protocol
described by Chung et al. (see reference 8). One of the positive clones, designated
pUR2969, was further characterized: the DNA was isolated and purified according to the
Qiagen protocol and subsequently characterized by DNA sequencing. DNA
sequencing was mainly performed as described by Sanger et al. (see reference 9) and
Hsiao (see reference 10), here with the Sequenase version 2.0 kit from United States
Biochemical Company, according to the protocol with T7 DNA polymerase
(Amersham International plc) and [35S]dATPαS (Amersham International plc: 370
MBq/ml; 22 TBq/mmol).

This plasmid was then transformed into S. cerevisiae strain MT302/1C according to
the protocol from Klebe et al. (see reference 11).

Yeast transformants were selected on selective plates lacking leucine, on which 40 µl
X-α-Gal (20 mg/ml in DMF; 5-bromo-4-chloro-3-indolyl-α-D-galactoside, Boehringer
Mannheim) was spread, to directly test for α-galactosidase activity (see reference 12).
To demonstrate the expression, secretion, localization and activity of the chimeric
protein the following analyses were performed:

1. Expression and secretion

S. cerevisiae strain MT302/1C was transformed with either plasmid pSY13 containing
the α-galactosidase gene of Cyamopsis tetragonoloba or plasmid pUR2969 containing
the α-galactosidase/α-agglutinin fusion construct. During batch culture α-galactosidase
activities were determined for washed cells and growth medium. The results are given
in Figure 3 and Figure 4. The α-galactosidase expressed from yeast cells containing
plasmid pSY13 was almost exclusively present in the growth medium (Figure 3A),
whereas the α-galactosidase/α-agglutinin fusion protein was almost exclusively cell
associated (Figure 4A). Moreover, the immobilized, cell wall-associated
α-galactosidase/α-agglutinin fusion enzyme retained its complete activity over the whole
incubation time, while the secreted and released enzyme lost about 90% of its
activity after an incubation of 65 hours. This indicates that the immobilization of the
described enzyme in the cell wall of yeast protects the enzyme against inactivation,
presumably by proteinases, and thereby increases its stability significantly.
Further insight into the localization of the different gene products was obtained by
Western analysis. For this purpose, cells were harvested by centrifugation and washed in 10
mM Tris.HCl, pH 7.8; 1 mM PMSF at 0 °C, and all subsequent steps were performed
at the same temperature. Three ml isolation buffer and 10 g of glass beads were
added per gram of cells (wet weight). The mixture was shaken in a Griffin shaker at
50% of its maximum speed for 30 minutes. The supernatant was isolated and the
glass beads were washed with 1 M NaCl and 1 mM PMSF until the washes were
clear. The supernatant and the washes were pooled. The cell walls were recovered by
centrifugation and were subsequently washed in 1 mM PMSF.
Non-covalently bound proteins or proteins bound through disulphide bridges were
released from cell walls by boiling for 5 minutes in 50 mM Tris.HCl, pH 7.8,
containing 2% SDS, 100 mM EDTA and 40 mM β-mercaptoethanol. The SDS-extracted
cell walls were washed several times in 1 mM PMSF to remove SDS. Ten
mg of cell walls (wet weight) were taken up in 20 µl 100 mM sodium acetate, pH 5.0,
containing 1 mM PMSF. To this, 0.5 mU of β-1,3-glucanase (Laminarase; Sigma
L5144 was used as a source of β-1,3-glucanase) was added, followed by incubation for
2 hours at 37 °C. Subsequently another 0.5 mU of β-1,3-glucanase was added,
followed by incubation for another 2 hours at 37 °C.

Proteins were denatured by boiling for 5 minutes preceding Endo-H treatment. Two
mg of protein were incubated in 1 ml 50 mM potassium phosphate, pH 5.5,
containing 100 mM β-mercaptoethanol and 0.5 mM PMSF with 40 mU Endo-H
(Boehringer) for 48 hours at 37 °C. Subsequently 20 mU Endo-H were added,
followed by 24 hours of incubation at 37 °C.

Proteins were separated by SDS-PAGE according to Laemmli (see reference 13) in
2.2-20% gradient gels. The gels were blotted by electrophoretic transfer onto
Immobilon polyvinylidene-difluoride membrane (Millipore) as described by Towbin et
al. (see reference 14). In case of highly glycosylated proteins a subsequent mild
periodate treatment was performed in 50 mM periodic acid, 100 mM sodium acetate,
pH 4.5, for several hours at 4 °C. All subsequent incubations were carried out at
room temperature. The blot was blocked in PBS, containing 0.5% gelatine and 0.5%
Tween-20, for one hour followed by incubation for 1 hour in probe buffer (PBS, 0.2%
gelatine, 0.1% Tween-20) containing 1:200 diluted serum. The blot was subsequently
washed several times in washing buffer (PBS; 0.2% gelatine; 0.5% Tween-20)
followed by incubation for 1 hour in probe buffer containing 125I-labelled protein A
(Amersham). After several washes in washing buffer, the blot was air-dried, wrapped
in Saran (Dow) and exposed to X-omat S film (Kodak) with intensifying screen at
-70 °C. An Omnimedia 6cx scanner and the Adobe Photoshop programme were used
to quantify the amount of labelled protein.

The results of the various protein isolation procedures from both transformants are
given in Figure 5. For the transformants comprising the pSY13 plasmid the bulk of
the enzyme was localized in the medium, with only minor amounts of enzyme
entrapped in, rather than bound to, the cell wall (Figure 5A); these could be removed
completely by SDS extraction. The fusion protein, in contrast, was tightly bound to
the cell wall, with only small amounts of α-galactosidase/α-agglutinin delivered into
the surrounding culture fluid or being SDS-extractable. Laminarinase extraction of
cell walls from cells expressing the free α-galactosidase liberated no further enzyme,
whereas identical treatment of cells expressing the fusion enzyme released the bulk
of the enzyme. This indicates that the fusion protein is intimately associated with the
cell wall glucan in S. cerevisiae, like α-agglutinin, while α-galactosidase alone is not.
The subsequent Endo-H treatment showed heavy glycosylation of the fusion protein,
a result entirely in agreement with the described extensive glycosylation of the
C-terminal part of α-agglutinin.

2. Localization

Immunofluorescent labelling with anti-α-galactosidase serum was performed on intact
cells to determine the presence and distribution of α-galactosidase/α-agglutinin fusion
protein in the cell wall. Immunofluorescent labelling was carried out without fixing
according to Watzele et al. (see reference 15). Cells of OD530 = 2 were isolated and
washed in TBS (10 mM Tris.HCl, pH 7.8, containing 140 mM NaCl, 5 mM EDTA
and 20 µg/ml cycloheximide). The cells were incubated in TBS + anti-α-galactosidase
serum for 1 hour, followed by several washings in TBS. A subsequent incubation was
carried out with FITC-conjugated anti-rabbit IgG (Sigma) for 30 minutes. After
washing in TBS, cells were taken up in 10 mM Tris.HCl, pH 9.0, containing 1 mg/ml
p-phenylenediamine and 0.1% azide and were photographed on a Zeiss 68000
microscope. The results of this analysis are given in Figure 6, showing clearly that
the chimeric α-galactosidase/α-agglutinin is localized at the surface of the yeast cell.
The uniform labelling of buds of various sizes, even very small ones, demonstrates that
the fusion enzyme is continuously incorporated into the cell wall throughout the cell
cycle and that it instantly becomes tightly linked.

3. Activity 

To quantitatively assay α-galactosidase activity, 200 µl samples containing 0.1 M
sodium acetate, pH 4.5, and 10 mM p-nitrophenyl-α-D-galactopyranoside (Sigma)
were incubated at 37 °C for exactly 5 minutes. The reaction was stopped by addition
of 1 ml 2% sodium carbonate. From intact cells and cell walls, removed by
centrifugation and isolated and washed as described, the α-galactosidase activity was
calculated using the extinction coefficient of p-nitrophenol of 18.4 cm²/µmole at 410 nm.
One unit was defined as the hydrolysis of 1 µmole substrate per minute at 37 °C.
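
The conversion from a measured absorbance to enzyme units can be sketched as follows. The sketch assumes a sample volume of 200 µl, an extinction coefficient of 18.4 cm²/µmol (equivalent to 18.4 ml per µmol per cm), a cuvette path length of 1 cm and an absorbance read on the full 1.2 ml stopped mixture. The path length, the function name and the example absorbance are illustrative assumptions, not data from this specification.

```python
import math

EPSILON = 18.4      # cm^2/umol = ml/(umol*cm), p-nitrophenol at 410 nm
PATH_CM = 1.0       # assumed cuvette path length (not stated in the text)
SAMPLE_ML = 0.2     # assay sample volume, assumed to be 200 microlitres
STOP_ML = 1.0       # 2% sodium carbonate added to stop the reaction
MINUTES = 5.0       # incubation time at 37 degrees C


def units_from_absorbance(a410):
    """Return enzyme units (umol substrate hydrolysed per minute) in the sample."""
    conc_umol_per_ml = a410 / (EPSILON * PATH_CM)            # Lambert-Beer law
    amount_umol = conc_umol_per_ml * (SAMPLE_ML + STOP_ML)   # in stopped mixture
    return amount_umol / MINUTES


# Hypothetical reading: an absorbance of 0.92 at 410 nm.
example_units = units_from_absorbance(0.92)
print(f"{example_units:.3f} U in the assayed sample")
assert math.isclose(example_units, 0.012, rel_tol=1e-9)
```

Dividing the result by the amount of cells or cell walls in the sample would give a specific activity; that normalisation is left out here because the reference quantity depends on the fraction assayed.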




Table 1: α-Galactosidase activity in yeast cells

Expressed protein              α-Galactosidase activity
                               Growth      Intact     Isolated
                               medium      cells      cell walls

α-galactosidase                14.7        0.37       0.01
αGal/αAGG fusion protein        0.54       13.3       10.9

Transformed MT302/1C cells were in exponential phase (OD530 = 2). One unit is
defined as the hydrolysis of 1 µmole of p-nitrophenyl-α-D-galactopyranoside per
minute at 37 °C.



The results are summarized in Table 1. While the great majority of α-galactosidase
was distributed in the culture fluid, most of the fusion product was associated with the
cells, primarily with the cell wall. Taking together the results shown in Figures 3 to 6
and in Table 1, it could be calculated that the enzymatic α-galactosidase activity of
the chimeric enzyme is as good as that of the free enzyme. Moreover, during
stationary phase, the activity of the α-galactosidase in the growth medium decreased
whereas the activity of the cell wall associated α-galactosidase/α-agglutinin fusion
remained constant, indicating that the cell associated fusion protein is protected from
inactivation or proteolytic degradation.
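
The distribution summarized above can be made explicit with a small calculation on the Table 1 values. The sketch assumes that the growth medium and intact cell columns are expressed in the same units and can simply be added to give the total activity; this additivity and the function name are assumptions made for illustration.

```python
# Activities as read from Table 1: (growth medium, intact cells, isolated cell walls).
TABLE_1 = {
    "alpha-galactosidase": (14.7, 0.37, 0.01),
    "alphaGal/alphaAGG fusion protein": (0.54, 13.3, 10.9),
}


def cell_associated_fraction(medium, intact_cells):
    """Fraction of the total activity (medium + cells) that is found on the cells."""
    return intact_cells / (medium + intact_cells)


fractions = {
    name: cell_associated_fraction(medium, cells)
    for name, (medium, cells, _walls) in TABLE_1.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {100 * fraction:.1f}% cell-associated")
```

With these numbers about 2.5% of the free enzyme activity and about 96.1% of the fusion protein activity is cell-associated.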

N.B. The essence of this EXAMPLE was published during the priority year by M.P.
Schreuder et al. (see reference 25).



EXAMPLE 2A Immobilized Humicola lipase/α-agglutinin on the surface of S.
cerevisiae (inducible expression of immobilized enzyme system).

The construction and isolation of the 1.4 kb NheI/HindIII fragment containing the
C-terminal part of α-agglutinin has been described in EXAMPLE 1. Plasmid pUR7021
contains an 894 bp long synthetically produced DNA fragment encoding the lipase of
Humicola (see reference 16 and SEQ ID NO: 7 and 8), cloned into the
EcoRI/HindIII restriction sites of the commercially available vector pTZ18R (see
Figure 7). For the proper one-step modification of both the 5' end and the 3' end of
the DNA part coding for the mature lipase, the PCR technique can be applied.
For this purpose the DNA oligonucleotides lipo1 (see SEQ ID NO: 3) and lipo2 (see SEQ
ID NO: 6) can be used as primers in a standard PCR protocol, generating an 826 bp
long DNA fragment with an EagI and a HindIII restriction site at the ends, which can
be combined with the larger part of the EagI/HindIII digested pUR2650, a plasmid
containing the α-galactosidase gene preceded by the invertase signal sequence as
described earlier in this specification, thereby generating plasmid pUR2970A (see
Figure 7).

PCR oligonucleotides for the in-frame linkage of Humicola lipase and the
C-terminus of α-agglutinin.

primer lipo1 (start of the mature lipase): see SEQ ID NO: 3

PCR oligonucleotides for the in-frame transition between the C-terminus of lipase
and the C-terminal part of α-agglutinin.

primer lipo2: see SEQ ID NO: 6

Through the PCR method an NheI site will be created at the end of the coding
sequence of the lipase, allowing the in-frame linkage between the DNA coding for
lipase and the DNA coding for the C-terminal part of α-agglutinin. Plasmid
pUR2970A can then be digested with NheI and HindIII and the 1.4 kb NheI/HindIII
fragment containing the C-terminal part of α-agglutinin from plasmid pUR2968 can
be combined with the larger part of NheI and HindIII treated plasmid pUR2970A,
resulting in plasmid pUR2971A. From this plasmid the 2.2 kb EagI/HindIII fragment
can be isolated and ligated into the EagI- and HindIII-treated pUR2741, whereby
plasmid pUR2741 is a derivative of pUR2740 (see reference 17), where the second
EagI restriction site in the already inactive Tet resistance gene was deleted through
NruI/SalI digestion. The SalI site was filled in prior to religation. The ligation then
results in pUR2972A containing the GAL7 promoter, the invertase signal sequence,
the chimeric lipase/α-agglutinin gene, the 2 µm sequence, the defective Leu2
promoter and the Leu2 gene. This plasmid can be used for transforming S. cerevisiae and
the transformed cells can be cultivated in YP medium containing galactose as an
inducer without repressing amounts of glucose being present, which causes the
expression of the chimeric lipase/α-agglutinin gene.

The expression, secretion, localization and activity of the chimeric lipase/α-agglutinin
can be analyzed using similar procedures as given in EXAMPLE 1.

In a similar way variants of Humicola lipase, obtained via rDNA techniques, can be
linked to the C-terminal part of α-agglutinin, which variants can have a higher
stability during (inter)esterification processes.



EXAMPLE 2B Immobilized Humicola lipase/α-agglutinin on the surface of S.
cerevisiae (inducible expression of immobilized enzyme system)

EXAMPLE 2A describes a protocol for preparing a particular construct. Before
carrying out the work it was considered more convenient to use the expression vector
described in EXAMPLE 1, so that the construction route given in this EXAMPLE 2B
differs on minor points from the construction route given in EXAMPLE 2A and the
resulting plasmids are not identical to those described in EXAMPLE 2A. However,
the essential gene construct comprising the promoter, signal sequence, and the
structural gene encoding the fusion protein is the same in EXAMPLES 2A and 2B.

1. Construction



The construction and isolation of the 1.4 kb NheI/HindIII fragment encoding the
C-terminal part of the α-agglutinin cell wall protein has been described in EXAMPLE 1.
The plasmid pUR7033 (resembling pUR7021 of EXAMPLE 2A) was made by
treating the commercially available vector pTZ18R with EcoRI and HindIII and
ligating the resulting vector fragment with an 894 bp long synthetically produced
EcoRI/HindIII DNA fragment encoding the lipase of Humicola (see SEQ ID NO: 7
and 8, and reference 16).

For the fusion of the lipase to the C-terminal, cell wall anchor-comprising domain of
α-agglutinin, plasmid pUR7033 was digested with EagI and HindIII, and the lipase
coding sequence was isolated and ligated into the EagI- and HindIII-digested yeast
expression vector pSY1 (see reference 27), thereby generating pUR7034 (see Figure
13). This is a 2 µm episomal expression vector, containing the α-galactosidase gene
described in EXAMPLE 1, preceded by the invertase (SUC2) signal sequence under
the control of the inducible GAL7 promoter.

Parallel to this digestion, pUR7033 was also digested with EcoRV and HindIII,
thereby releasing a 57 bp long DNA fragment, possessing codons for the last 15
carboxyterminal amino acids. This fragment was exchanged against a small DNA
fragment, generated through the hybridisation of the two chemically synthesized
deoxyoligonucleotides SEQ ID NO: 9 and SEQ ID NO: 10. After annealing of both
DNA strands, these two oligonucleotides essentially reconstruct the rest of the 3'
coding sequence of the initial lipase gene, but additionally introduce downstream of
the lipase gene a new NheI restriction site, followed by a HindIII site in close vicinity,
whereby the first three nucleotides of the NheI site form the codon for the last amino
acid of the lipase. The resulting plasmid was designated pUR2970B. Subsequently,
this construction intermediate was digested with EagI and NheI, the lipase encoding
fragment was isolated, and, together with the 1.4 kb NheI/HindIII fragment of
pUR2968, ligated into the EagI- and HindIII-cut pSY1 vector. The outcome of this
3-point ligation was called pUR2972B (see Figure 14), the final lipolase-α-agglutinin
yeast expression vector.

This plasmid was used for transforming S. cerevisiae strain SU10 as described in
reference 17 and the transformed cells were cultivated in YP medium containing
galactose as the inducer without repressing amounts of glucose being present, which
causes the expression of the chimeric lipase/α-agglutinin gene.

2. Activity

To quantify the lipase activity, two activity measurements with two separate substrates
were performed. In both cases, SU10 yeast cells transformed with either plasmid
pUR7034 or pSY1 served as control. For this purpose, yeast cell transformants containing
either plasmid pSY1 or plasmid pUR7034 or plasmid pUR2972B were grown for
24 h in YNB-glucose medium supplied with histidine and uracil, then diluted 1:10 in
YP medium supplied with 5% galactose, and cultured again. After 24 h incubation at
30°C, a first measurement for both assays was performed.

The first assay applied was the pH stat method. Within this assay, one unit of lipase
activity is defined as the amount of enzyme capable of liberating one micromole of
fatty acid per minute from a triglyceride substrate under standard assay conditions (30
ml assay solution containing 38 mM olive oil, considered as pure trioleate, emulsified
with 1:1 w/w gum arabic, 20 mM calcium chloride, 40 mM sodium chloride, 5 mM
Tris, pH 9.0, 30°C) in a Radiometer pH stat apparatus (pHM 84 pH meter, ABU 80
autoburette, TTA 60 titration assembly). The fatty acids formed were titrated with
0.05 N NaOH and the activity measured was based on alkali consumption in the
interval between 1 and 2 minutes after addition of the putative enzyme batch. To test for
immobilized lipase activity, 1 ml of each culture was centrifuged, the supernatant was
saved, the pellet was resuspended and washed in 1 ml 1 M sorbitol, subsequently
centrifuged again and resuspended in 200 µl 1 M sorbitol. From each type of yeast cell
the first supernatant and the washed cells were tested for lipase activity.
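
The unit definition given for the pH stat assay reduces to simple arithmetic: the micromoles of NaOH consumed in the one-minute measuring interval equal the micromoles of fatty acid liberated. The following Python sketch illustrates this under stated assumptions. The function names and the example reading are hypothetical and are not taken from the experiments; only the 0.05 N titrant strength comes from the assay description.

```python
NAOH_NORMALITY = 0.05  # eq/l, titrant strength stated in the assay description


def lipase_units(titrant_ul: float, minutes: float = 1.0,
                 normality: float = NAOH_NORMALITY) -> float:
    """Return lipase units (micromol fatty acid liberated per minute).

    titrant_ul is the NaOH volume in microlitres consumed during the
    measuring interval. One microlitre of an N-normal base holds
    N micromol of hydroxide, which neutralises N micromol of fatty acid.
    """
    if minutes <= 0:
        raise ValueError("measuring interval must be positive")
    return titrant_ul * normality / minutes


def units_per_ml(titrant_ul: float, sample_ul: float) -> float:
    """Normalise the measured units to 1 ml of the sample that was added."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return lipase_units(titrant_ul) * 1000.0 / sample_ul


if __name__ == "__main__":
    # Hypothetical reading: 40 microlitres of NaOH consumed between
    # minute 1 and minute 2 after adding a 100 microlitre sample.
    print(units_per_ml(40.0, 100.0), "LU/ml")
```

Keeping the per-minute calculation separate from the per-ml normalisation makes it easy to apply the same function to samples of different volume.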



A: Lipase activity after 24h (LU/ml)

                  cell bound    culture fluid
pSY1                   5.9            8.8
pUR7034               24.1          632.0
pUR2972B-(1)          18.7           59.6
pUR2972B-(2)          24.6           40.5

B: Lipase activity after 48h (LU/ml)

                  cell bound    culture fluid    OD660
pSY1                   6.4           43           ~40
pUR7034              215.0         2750.0         ~40
pUR2972B-(1)          37.0           87.0         ~40
pUR2972B-(2)          34.0           82.0         ~40
The rest of the yeast cultures was incubated further, and essentially the same
separation procedure was carried out after 48 hours. Depending on the initial activity
measured, the actual volume of the sample measured varied between 25 µl and
150 µl.

This series of measurements indicates that yeast cells comprising the plasmid coding
for the lipase-α-agglutinin fusion protein in fact express some lipase activity which is
associated with the yeast cell.
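
One way to compare the constructs is to express the cell bound activity as a share of the total activity measured. The Python sketch below does this for the 24 h values reported above. It assumes that the cell bound and culture fluid figures refer to the same volume of original culture, which the text does not state explicitly, so the percentages are indicative only. The function name is illustrative.

```python
# 24 h values (LU/ml) as reported in table A: (cell bound, culture fluid)
ACTIVITY_24H = {
    "pSY1": (5.9, 8.8),
    "pUR7034": (24.1, 632.0),
    "pUR2972B-(1)": (18.7, 59.6),
    "pUR2972B-(2)": (24.6, 40.5),
}


def cell_bound_fraction(cell_bound: float, culture_fluid: float) -> float:
    """Share of the total measured activity that stays with the cells.

    Assumes both inputs are expressed on the same per-ml basis.
    """
    total = cell_bound + culture_fluid
    if total <= 0:
        raise ValueError("total activity must be positive")
    return cell_bound / total


if __name__ == "__main__":
    for name, (bound, fluid) in ACTIVITY_24H.items():
        share = 100 * cell_bound_fraction(bound, fluid)
        print(f"{name:14s} {share:5.1f} % cell bound")
```

The pSY1 row reflects background readings of the vector without a lipase gene, so its ratio carries little meaning on its own.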
An additional second assay was performed to further confirm the immobilization of
lipase on the yeast cell surface. Briefly, within this assay, the kinetics of the
PNP (= paranitrophenyl) release from PNP-butyrate is determined by measurement of
the OD at 400 nm. For this purpose, 10 ml cultures containing yeast cells with either pSY1,
pUR7034 or pUR2972B were centrifuged, the pellet was resuspended in 4 ml of
buffer A (0.1 M NaOAc, pH 5.0 and 1 mM PMSF), from this 4 ml 500 µl was
centrifuged again and resuspended in 500 µl PNB buffer (20 mM Tris-HCl, pH 9.0,
20 mM CaCl2, 25 mM NaCl), centrifuged once again, and finally resuspended in
400 µl PNB buffer. This fraction was used to determine the cell bound fraction of
lipase.

The remaining 3500 µl were spun down, the pellet was resuspended in 4 ml buffer A, to each
of this, 40 µl laminarinase (ex mollusc, 1.25 mU/µl) was added and first incubated for
[...] hours at 37°C, followed by an overnight incubation at 20°C. Then the reaction
mixture, still containing intact cells, was centrifuged again and the supernatant was
used to determine the amount of originally cell wall bound material released
through laminarinase incubation. The final pellet was resuspended in 400 µl PNP
buffer, to calculate the still cell associated part. The blank reaction of a defined
volume of a specific culture fraction in 4 ml assay buffer was determined, and then the
reaction was started through addition of 80 µl of substrate solution (100 mM
PNP-butyrate in methanol), and the reaction was observed at 25°C at 400 nm in a
spectrophotometer.
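
The kinetic read-out of this assay amounts to comparing the rate of OD400 increase of the sample with that of the blank. The Python sketch below shows one way to do so under stated assumptions: the time points and OD readings are invented for illustration, the blank is treated as a separate trace recorded at the same time points, and a plain least-squares slope stands in for whatever evaluation was actually used.

```python
from typing import Sequence


def slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Least-squares slope of values against times."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired readings")
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def blank_corrected_rate(times: Sequence[float],
                         sample_od: Sequence[float],
                         blank_od: Sequence[float]) -> float:
    """Rate of OD400 increase of the sample minus that of the blank."""
    return slope(times, sample_od) - slope(times, blank_od)


if __name__ == "__main__":
    minutes = [0.0, 0.5, 1.0, 1.5, 2.0]
    blank = [0.010, 0.011, 0.012, 0.013, 0.014]    # hypothetical readings
    sample = [0.010, 0.060, 0.110, 0.160, 0.210]   # hypothetical readings
    rate = blank_corrected_rate(minutes, sample, blank)
    print(f"{rate:.3f} OD400 units per minute")
```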



                  cell bound        activity in     laminarinase    laminarinase    OD660
                  activity*         [illegible]     extract         cells
pSY1              0.001 (116 µl)    0.001           0.028           0.000           2.6
pUR7034           0.293 (220 µl)    0.446           0.076           0.985           2.36
pUR2972B-(1)      0.494 (143 µl)    [illegible]     0.170           0.208           2.10

* unless otherwise mentioned, the volume of enzyme solution added was 20 µl

This result positively demonstrates that a significant amount of lipase activity is
immobilized on the surface of yeast cells containing plasmid pUR2972B. Here again,
incorporation took place in such a way that the reaction was catalyzed by cell wall
inserted lipase of intact cells, indicating an exterior-orientated immobilization.
Furthermore, the release of a significant amount of lipase activity after incubation
with laminarinase again demonstrates the presumably covalent incorporation of a
heterologous enzyme through gene fusion with the C-terminal part of α-agglutinin.

The expression, secretion, and subsequent incorporation of the lipase-α-agglutinin
fusion protein into the yeast cell wall was also confirmed through immunofluorescent
labelling with anti-lipolase serum essentially as described in EXAMPLE 1, item
2. Localization.

As can be seen in Figure 15, the immunofluorescent stain shows essentially an
analogous picture to the α-galactosidase immunostain, with clearly detectable
reactivity on the outside of the cell surface (see Figure 15 A showing a clear halo
around the cells and Figure 15 B showing a lighter circle at the surface of the cells), but
neither in the medium nor in the interior of the cells. Yeast cells expressing
pUR2972B, the Humicola lipase-α-agglutinin fusion protein, become homogeneously
stained on the surface, indicating the virtually entire immobilization of a chimeric
enzyme with an α-agglutinin C-terminus on the exterior of a yeast cell. In the
control experiment performed, SU10 yeast cells containing plasmid pUR7034 served
as a control and here no cell surface bound reactivity against the applied anti-lipase
serum could be detected.

In a similar way variants of Humicola lipase, obtained via rDNA techniques, can be
linked to the C-terminal part of α-agglutinin, which variants can have a higher
stability during (inter)esterification processes.

EXAMPLE 3 Immobilized Humicola lipase/α-agglutinin on the surface of S.
cerevisiae (constitutive expression of immobilized enzyme system)

Plasmid pUR2972 as described in EXAMPLE 2 can be treated with EagI and HindIII
and the about 2.2 kb fragment containing the lipase/α-agglutinin gene can be
isolated. Plasmid pSY16 can be restricted with EagI and HindIII and between these
sites the 2.2 kb fragment containing the lipase/α-agglutinin fragment can be ligated,
resulting in pUR2973. The part of this plasmid that is involved in the production of
the chimeric enzyme is similar to pUR2972 with the exception of the signal sequence.
Whereas pUR2972 contains the SUC2-invertase signal sequence, pUR2973 contains
the α-mating factor signal sequence (see reference 18). Moreover the plasmid
pUR2973 contains the Leu2 marker gene with the complete promoter sequence,
instead of the truncated promoter version of pUR2972.

EXAMPLE 4 Immobilized Geotrichum lipase/α-agglutinin on the surface of S.
cerevisiae

The construction and isolation of the 1.4 kb NheI/HindIII fragment comprising the
C-terminal part of the AGα1 (α-agglutinin) gene has been described in EXAMPLE 1.
For the in-frame gene fusion of the DNA coding for the C-terminal membrane
anchor of α-agglutinin to the complete coding sequence of Geotrichum candidum
lipase B from strain CMICC 335426 (see Figure 8 and SEQ ID NO: 11 and 12) the
plasmid pUR2974 can be used. This plasmid, derived from the commercially available
pBluescript II SK plasmid, contains the cDNA coding for the complete G. candidum
lipase II on an 1850 bp long EcoRI/XhoI insert (see Figure 9).
To develop an expression vector for S. cerevisiae with homologous signal sequences,
the N-terminus of the mature lipase B was determined experimentally by standard
techniques. The obtained amino acid sequence of "Gln-Ala-Pro-Thr-Ala-Val..." is in
complete agreement with the cleavage site of the signal peptidase on the G. candidum
lipase II (see reference 19).

For the fusion of the mature lipase B to the S. cerevisiae signal sequences of SUC2
(invertase) or α-mating factor (prepro-αMF) on the one hand and the in-frame fusion to
the 3' part of the AGα1 gene on the other, the PCR technique can be used. The PCR primer lipo3
(see SEQ ID NO: 13) can be constructed in such a way that the originally present
EagI site in the 5' part of the coding sequence (spanning codons 5-7 of the mature
protein) will become inactivated without any alteration in the amino acid sequence.
To facilitate the subsequent cloning procedures, the PCR primer can further contain
a new EagI site at the 5' end, for the in-frame ligation to the SUC2 signal sequence or
prepro-αMF sequence, respectively. The corresponding PCR primer lipo4 (see SEQ
ID NO: 16) contains an extra NheI site behind the nucleotides coding for the
C-terminus of lipase B, to ensure the proper fusion to the C-terminal part of
α-agglutinin.

PCR oligonucleotides for the in-frame linkage of G. candidum lipase B to the
SUC2 signal sequence and the C-terminal part of α-agglutinin.

a: N-terminal transition to either prepro-αMF sequence or SUC2 signal sequence
   primer lipo3 (see SEQ ID NO: 13) [sequence alignment illegible in the source]

b: C-terminal fusion to C part of α-agglutinin
   primer lipo4 (see SEQ ID NO: 16) [sequence alignment illegible in the source]

The PCR product with the modified ends can be generated by standard PCR
protocols, using instead of the normal Ampli-Taq polymerase the new thermostable
VENT polymerase, which also exhibits proofreading activity, to ensure an error-free
DNA template. Through digestion of the formerly described plasmid pUR2972 with
EagI (complete) and NheI (partial), the Humicola lipase fragment can be exchanged
against the DNA fragment coding for lipase B, thereby generating the final S.
cerevisiae expression vector pUR2975 (see Figure 9).

The Humicola lipase-α-agglutinin fusion protein coding sequence can be exchanged
against the lipase B/α-agglutinin fusion construct described above by digestion of the
described vector pUR2973 with EagI/HindIII, resulting in pUR2976 (see Figure 9).



EXAMPLE 5 Immobilized Rhizomucor miehei lipase/α-agglutinin on the surface
of S. cerevisiae

The construction and isolation of the 1.4 kb NheI/HindIII fragment encoding the
C-terminal part of α-agglutinin has been described in EXAMPLE 1. The plasmid
pUR2980 contains a 1.25 kb cDNA fragment cloned into the SmaI site of
commercially available pUC18, which (synthetically synthesizable) fragment encodes
the complete coding sequence of triglyceride lipase of Rhizomucor miehei (see
reference 20), an enzyme used in a number of processes to interesterify
triacylglycerols (see reference 21) or to prepare biosurfactants (see reference 22).
Beside the 269 codons of the mature lipase molecule, the fragment also harbours
codons for the 24 amino acid signal peptide as well as 70 amino acids of the
propeptide. PCR can easily be applied to ensure the proper fusion of the gene
fragment encoding the mature lipase to the SUC2 signal sequence or the prepro α-mating
factor sequence of S. cerevisiae, as well as the in-frame fusion to the described
NheI/HindIII fragment. The following two primers, lipo5 (see SEQ ID NO: 17) and
lipo6 (see SEQ ID NO: 20), will generate an 833 bp DNA fragment, which after
Proteinase K treatment and digestion with EagI and NheI can be cloned as an 816 bp
long fragment into the EagI/NheI digested plasmids pUR2972 and pUR2973,
respectively (see Figure 7).

                       EagI           S   I   D   G   G   I
primer lipo5:   5'-CCC GCG GCC [...] AGC ATT GAT GGT GGT ATC-3'
(for the part of the lipase non-coding strand see SEQ ID NO: 18)

lipase (coding strand):   5'-AAC ACA GGC CTC TGT ACT-3'
primer lipo6:             3'-[...] CGATCGCGCC-5'
(for the part of the lipase coding strand see SEQ ID NO: 19)

These new S. cerevisiae expression plasmids contain the GAL7 promoter, the invertase
signal sequence (pUR2981) or the prepro-α-mating factor sequence (pUR2982), the
chimeric Rhizomucor miehei lipase/α-agglutinin gene, the 2 µm sequence, the
defective (truncated) Leu2 promoter and the Leu2 gene. These plasmids can be
transformed into S. cerevisiae and grown and analyzed using protocols described in
earlier EXAMPLES.

EXAMPLE 6 Immobilized Aspergillus niger glucose oxidase/GPI anchored cell
wall proteins on the surface of S. cerevisiae

Glucose oxidase (β-D-glucose:oxygen 1-oxidoreductase, EC 1.1.3.4) from Aspergillus niger
catalyses the oxidation of β-D-glucose to glucono-δ-lactone and the concomitant
reduction of molecular oxygen to hydrogen peroxide. The fungal enzyme consists of a
homodimer of molecular weight 150,000 containing two tightly bound FAD co-factors.
Beside the use in glucose detection kits the enzyme is useful as a source of hydrogen
peroxide in food preservation. The gene was cloned from both cDNA and genomic
libraries; the single open reading frame contains no intervening sequences and
encodes a protein of 605 amino acids (see reference 23).
With the help of two proper oligonucleotides the coding part of the sequence is
adjusted in a one-step modifying procedure by PCR in such a way that a fusion gene
product will be obtained coding for glucose oxidase and the C-terminal cell wall
anchor of the FLO1 gene product or α-agglutinin. Thus, some of the plasmids
described in former EXAMPLES can be utilized to integrate the corresponding
sequence in-frame between one of the signal sequences used in the EXAMPLES and
the NheI/HindIII part of the AGα1 gene.

Since dimerisation of the two monomers might be a prerequisite for activity, in an
alternative approach the complete coding sequence for glucose oxidase without the
GPI anchor can be expressed in an S. cerevisiae transformant which already contains the
fusion construct. This can be fulfilled by constitutive expression of the fusion construct
containing the GPI anchor with the help of the GAPDH or PGK promoter, for
example. The unbound, non-anchored monomer can be produced by using a DNA
construct comprising an inducible promoter, for instance the GAL7 promoter.

EXAMPLE 7 Process to convert raffinose, stachyose and similar sugars in soy
extracts with α-galactosidase/α-agglutinin immobilized on yeasts

The yeast transformed with plasmid pUR2969 can be cultivated on a large scale. At
regular intervals during cultivation the washed cells should be analyzed for the
presence of α-galactosidase activity on their surface with methods described in
EXAMPLE 1. When both cell density and α-galactosidase activity/biomass reach
their maximum, the yeast cells can be collected by centrifugation and washed.
The washed cells can then be added to soy extracts. The final concentration of the
yeast cells can vary between 0.1 and 10 g/l, preferably the concentration should be
above 1 g/l. The temperature of the soy extract should be < 8 °C to reduce the
metabolic activity of the yeast cells. The conversion of raffinose and stachyose can be
analyzed with HPLC methods and after 95 % conversion of these sugars the yeast
cells can be removed by centrifugation and their α-galactosidase activity/g biomass
can be measured. Centrifugates with a good activity can be used in a subsequent
conversion process, whereas centrifugates with an activity of less than 50 % of the
original activity can be resuscitated in the growth medium and the cells can be
allowed to recover for 2 to 4 hours. Thereafter the cells can be centrifuged, washed
and used in a subsequent conversion process.
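
The recycling rule at the end of this EXAMPLE can be written as a small decision function. The Python sketch below is illustrative only: the class and function names are hypothetical, and it assumes that any batch retaining at least 50 % of its original activity counts as having "a good activity", a boundary the text does not define explicitly.

```python
from dataclasses import dataclass

# "less than 50 % of the original activity" triggers resuscitation
RESUSCITATION_THRESHOLD = 0.5


@dataclass(frozen=True)
class Batch:
    original_activity: float  # activity per g biomass before the first run
    current_activity: float   # activity per g biomass after a conversion run


def next_step(batch: Batch) -> str:
    """Return 'reuse' or 'resuscitate' for a recovered batch of cells.

    Assumption: batches at or above the threshold are treated as having
    good activity and are reused directly.
    """
    if batch.original_activity <= 0:
        raise ValueError("original activity must be positive")
    retained = batch.current_activity / batch.original_activity
    if retained < RESUSCITATION_THRESHOLD:
        return "resuscitate"
    return "reuse"


if __name__ == "__main__":
    print(next_step(Batch(original_activity=100.0, current_activity=72.0)))
    print(next_step(Batch(original_activity=100.0, current_activity=41.0)))
```

The same rule applies unchanged to the lipase based processes of the later EXAMPLES, where only the recovery time differs.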

EXAMPLE 8 Production of biosurfactants using Humicola
lipase/α-agglutinin immobilized on yeasts

The yeast transformed with plasmid pUR2972 or pUR2973 can be cultivated on a large
scale. At regular intervals during cultivation the washed cells can be analyzed for the
presence of lipase activity on their surface with methods described in EXAMPLE 1.
When both cell density and lipase/biomass reach their maximum, the yeast cells can
be collected by centrifugation and washed. The washed cells can be suspended in a
small amount of water and added to a reactor tank containing a mix of fatty acids,
preferably of a chain length of 12 to 18 carbon atoms, and sugars, preferably
glucose, galactose or sucrose. The total concentration of the water (excluding the
water in the yeast cells) might be below 0.1 %. The final concentration of the yeast
cells can vary between 0.1 and 10 g/l, preferably the concentration is above 1 g/l. The
tank has to be kept under an atmosphere of N2 and CO2 in order to avoid oxidation
of the (unsaturated) fatty acids and to minimize the metabolic activity of the yeasts.
The temperature of the mixture in the tank should be between 30 and 60 °C, depending on
the type of fatty acid used. The conversion of fatty acids can be analyzed with GLC
methods and after 95 % conversion of these fatty acids the yeast cells can be
removed by centrifugation and their lipase activity/g biomass can be measured.
Centrifugates with a good activity can be used in a subsequent conversion process,
whereas centrifugates with an activity of less than 50 % of the original activity can be
resuscitated in the growth medium and the cells can be allowed to recover for 2 to 8
hours. Thereafter the cells can be centrifuged again, washed and used in a subsequent
conversion process.

EXAMPLE 9 Production of special types of triacylglycerols using Rhizomucor
miehei lipase/α-agglutinin immobilized on yeasts

The yeast transformed with plasmid pUR2981 or pUR2982 can be cultivated on a
large scale. At regular intervals during cultivation the washed cells can be analyzed for
the presence of lipase activity on their surface with methods described in EXAMPLE
1. When both cell density and lipase/biomass reach their maximum, the yeast cells
can be collected by centrifugation and washed. The washed cells can be suspended in
a small amount of water and can be added to a reactor tank containing a mix of
various triacylglycerols and fatty acids. The total concentration of the water (excluding
the water in the yeast cells) might be below 0.1 %. The final concentration of the
yeast cells can vary between 0.1 and 10 g/l, preferably the concentration is above 1
g/l. The tank has to be kept under an atmosphere of N2 and CO2 in order to avoid
oxidation of the (unsaturated) fatty acids and to minimize the metabolic activity of
the yeasts. The temperature of the mixture in the tank should be between 30 and 70 °C,
depending on the types of triacylglycerol and fatty acid used. The degree of
interesterification can be analyzed with GLC/MS methods and after formation of at least 80 %
of the theoretical value of the desired type of triacylglycerol the yeast cells can be
removed by centrifugation and their lipase activity/g biomass can be measured.
Centrifugates with a good activity can be used in a subsequent conversion process,
whereas centrifugates with an activity of less than 50 % of the original activity can be
resuscitated in the growth medium and the cells should be allowed to recover for 2 to 8
hours. After that the cells can be centrifuged, washed and used in a subsequent
interesterification process.

Baker's yeasts of strain MT302/1C, transformed with either plasmid pSY13 or
plasmid pUR2969 (described in EXAMPLE 1) were deposited under the Budapest
Treaty at the Centraalbureau voor Schimmelcultures (CBS) on 3 July 1992 under
provisional numbers 330.92 and 329.92, respectively.

EXAMPLE 10 Immobilized Humicola lipase/FLO1 fusion on the surface of S.
cerevisiae

Flocculation, defined as "the (reversible) aggregation of dispersed yeast cells into
flocs" (see reference 24), is the most important feature of yeast strains in industrial
fermentations. Beside this it is of principal interest, because it is a property associated
with cell wall proteins and it is a quantitative characteristic. One of the genes
associated with the flocculation phenotype in S. cerevisiae is the FLO1 gene. The gene
is located at approximately 24 kb from the right end of chromosome I and the DNA
sequence of a clone containing major parts of the FLO1 gene has very recently been
determined (see reference 26). The sequence is given in Figure 11 and SEQ ID NO:
21 and 22. The cloned fragment appeared to be approximately 2 kb shorter than
the genomic copy as judged from Southern and Northern hybridizations, but encloses
both ends of the FLO1 gene. Analysis of the DNA sequence data indicates that the
putative protein contains at the N-terminus a hydrophobic region which conforms to a
signal sequence for secretion, a hydrophobic C-terminus that might function as a
signal for the attachment of a GPI-anchor and many glycosylation sites, especially in
the C-terminus, with 46.6 % serine and threonine in the arbitrarily defined
C-terminus (aa 271-894). Hence, it is likely that the FLO1 gene product is localized in an
orientated fashion in the yeast cell wall and may be directly involved in the process of
interaction with neighbouring cells. The cloned FLO1 sequence might therefore be
suitable for the immobilization of proteins or peptides on the cell surface by a
different type of cell wall anchor.

Recombinant DNA constructs can be obtained, for example by utilizing the DNA
coding for amino acids 271-894 of the FLO1 gene product, i.e. polynucleotide
811-2682 of Figure 11. Through application of two PCR primers pcrflo1 (see SEQ ID
NO: 23) and pcrflo2 (see SEQ ID NO: 26) NheI and HindIII sites can be introduced
at both ends of the DNA fragment. In a second step, the 1.4 kb NheI/HindIII
fragment present in pUR2972 (either A or B) containing the C-terminal part of
α-agglutinin can be replaced by the 1.9 kb DNA fragment coding for the C-terminal
part of the FLO1 protein, resulting in plasmid pUR2990 (see Figure 12), comprising a
DNA sequence encoding (a) the invertase signal sequence (SUC2) preceding (b) the
fusion protein consisting of (b.1) the lipase of Humicola (see reference 16) followed
by (b.2) the C-terminus of FLO1 protein (aa 271-894).
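
The correspondence between amino acids 271-894 and polynucleotide 811-2682 follows directly from the triplet code. The Python sketch below reproduces the arithmetic. It assumes that nucleotide 1 of the numbering is the first base of codon 1, which is what the stated coordinates imply; the function names are illustrative.

```python
def residue_range_to_nucleotides(first_aa: int, last_aa: int) -> tuple[int, int]:
    """Map an inclusive amino acid range to inclusive nucleotide coordinates.

    Assumes nucleotide 1 is the first base of the codon for residue 1.
    """
    if first_aa < 1 or last_aa < first_aa:
        raise ValueError("invalid residue range")
    start = 3 * (first_aa - 1) + 1  # first base of the first codon
    end = 3 * last_aa               # last base of the last codon
    return start, end


def fragment_length(first_aa: int, last_aa: int) -> int:
    """Length in base pairs of the coding fragment for the residue range."""
    start, end = residue_range_to_nucleotides(first_aa, last_aa)
    return end - start + 1


if __name__ == "__main__":
    print(residue_range_to_nucleotides(271, 894))
    print(fragment_length(271, 894), "bp")
```

For residues 271-894 this gives 624 codons, i.e. a coding stretch of 1872 bp, in line with the 1.9 kb fragment mentioned in the text.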
PCR oligonucleotides for the in-frame connection of the genes encoding the
Humicola lipase and the C-terminal part of the FLO1 gene product

                                               S   N   Y   A   V   S   T
primer pcrflo1:                 5'-GAATTC GCT AGC AAT TAT GCT GTC AGT ACC-3'
FLO1 gene (non-coding strand):  3'-           AGT TTA ATA CGA GAG TCA TGG TGA-5'
(for the part of the non-coding strand see SEQ ID NO: 24)

FLO1 coding strand:  5'-AATAA AATTCGCGTTCTTTTTACG-3'
primer pcrflo2:      3'-      TTAAGCGCAAGAAAAATGC TTCGAACTCGAG-5'
                                                  HindIII
(for the part of the coding strand see SEQ ID NO: 25)
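
The two primers can be checked mechanically for the restriction sites they introduce and for the reading frame they create. The Python sketch below is a minimal illustration, not the procedure used by the authors. The primer strings are transcribed from the alignment above (pcrflo2 rewritten 5' to 3'), the recognition sequences are the standard ones for EcoRI, NheI and HindIII, the standard genetic code is assumed, and all function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SITES = {"EcoRI": "GAATTC", "NheI": "GCTAGC", "HindIII": "AAGCTT"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primers as printed in the specification, both written 5' -> 3' here
PCRFLO1 = "GAATTCGCTAGCAATTATGCTGTCAGTACC"
PCRFLO2 = "GAGCTCAAGCTTCGTAAAAAGAACGCGAATT"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str) -> dict[str, int]:
    """Return the 0-based position of each listed site present in seq."""
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


def translate(seq: str) -> str:
    """Translate complete codons of seq, starting at its first base."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(find_sites(PCRFLO1))
    # Reading frame starting at the NheI site of pcrflo1
    print(translate(PCRFLO1[find_sites(PCRFLO1)["NheI"]:]))
    print(find_sites(PCRFLO2))
    # Part of pcrflo2 that anneals to FLO1, shown as the coding strand
    print(reverse_complement(PCRFLO2[12:]))
```

Translating from the NheI site onwards shows the Ala-Ser codons of the site followed by the FLO1 residues printed above the primer, which is what an in-frame junction requires.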

Plasmid pUR2972 (either A or B) can be restricted with NheI (partial) and HindIII
and the NheI/HindIII fragment comprising the vector backbone and the lipase gene
can be ligated to the correspondingly digested PCR product of the plasmid containing
the FLO1 sequence, resulting in plasmid pUR2990, containing the GAL7 promoter,
the S. cerevisiae invertase signal sequence, the chimeric lipase/FLO1 gene, the yeast 2
µm sequence, the defective Leu2 promoter and the Leu2 gene. This plasmid can be
transformed into S. cerevisiae and the transformed cells can be cultivated in YP
medium including galactose as inducer.

The expression, secretion, localization and activity of the chimeric lipase/FLO1
protein can be analyzed using similar procedures as given in EXAMPLE 1.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: 

(A) NAME: Unilever N.V. 

(B) STREET: Weena 455 

(C) CITY: Rotterdam 

(E) COUNTRY: The Netherlands 

(F) POSTAL CODE (ZIP): NL-3013 AL 

(A) NAME: Unilever PLC 

(B) STREET: Unilever House Blackfriars 

(C) CITY: London

(E) COUNTRY: United Kingdom 

(F) POSTAL CODE (ZIP): EC4P 4BQ 

(A) NAME: Franciscus Maria KLIS 

(B) STREET: Benedenlangs 102 

(C) CITY: Amsterdam 

(E) COUNTRY: The Netherlands 

(F) POSTAL CODE (ZIP): NL-1025 KL 

(A) NAME: Maarten Pleun SCHREUDER 

(B) STREET: Rode Kruislaan 1220 

(C) CITY: Diemen 

(E) COUNTRY: The Netherlands 
(F) POSTAL CODE (ZIP): NL-1111 XB

(A) NAME: Holger York TOSCHKA 

(B) STREET: Coornhertstraat 77 

(C) CITY: Vlaardingen 

(E) COUNTRY: The Netherlands 

(F) POSTAL CODE (ZIP): NL-3132 GB 

(A) NAME: Cornelis Theodorus VERRIPS

(B) STREET: Hagedoorn 18 

(C) CITY: Maassluis 

(E) COUNTRY: The Netherlands

(F) POSTAL CODE (ZIP): NL-3142 KB

(ii) TITLE OF INVENTION: Enzymic Processes based on naturally
immobilized enzymes that can easily be separated and
regenerated

(iii) NUMBER OF SEQUENCES: 26

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE: Patentln Release #1.0, Version #1.25 (EPO) 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6057 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Saccharomyces cerevisiae 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3653..5605

(D) OTHER INFORMATION: /function= "sexual agglutinisation" 



/product= "alpha- agglutinin" 



(xi) 



SEQUENCE DESCRIPTION: SEQ ID NO: 1: 



AAGCTTTAGG TAAGGGAGGC AGGGGGAAAA GATACTGAAA TGACGGAAAA CGAGAATATG 60 

GAGCAGGGAG CAACTTTTAG AGCTTTACCC GTTAAAAGGT CAAATCGAGG CTTCCTGCCT 120 

TTGTCTGATT TTAGTAGTAC CGGAAGGTTT ATTACGCCCA AGAACAGTGC TTGAATTGAG 180 

TTCTCGGGAC ACGGGAAAGA CAATGGAAGA AAAATTTACA TTCAGTAGCC TTATATATGA 240 

AATGCTGCCA AGCCACGTCT TTATAAGTAG ATAATGTCCC ATGAGCTGAA CTATGGGAAT 300 

TTATGACGCA GTTCATTGTA TATATATTAC ATTAACTCTT TAGTTTAACA TCTGAATTGT 360 

TTTATAAAAT AACTTTTTGA ATTTTTTTAT GATCGCTTAG TTAAGTCTAT TATATCAGGT 420 

TTTTTCATTC ATCATAATTG TTCGTTAAAT ATGAGTATAT TTAAATACAG GAATTAGTAT 480 

CATTTGCAGT CACGAAAAGG GCCGTTTCAT AGAGAGTTTT CTTAATAAAG TTGAGGGTTT 540 

CCGTGATAGT TTTGAGGGGT TGTTTGAACT AGATTTACGC TTACCTTTCA ACTGATTAAT 600 

TTTTTCAGCG GGCTTATCAT AATCATCCAT CATAGCAGTC TTTCTGGACT TCGTCGAGGA 660 

CTGGCTTTCT GAATTTTGAC GGTCCCTATT AGCTCCAGTT GGAGGAATTG AGTTACCTAC 720 

AACTGGCAAG AGGTCTTTGT TTGGATTCAA AATAGGACTT TGTGGTAGCA GTTTGGTTTT 780 

ATTCAATCTA AAGATATGAG AAACAGGTTT TAAGTAAATC GATACTATTG TACCAATGTT 840 

TAGCTCCAAT TCCTCCAAAA CGGTGGGATC TAATTTTGTG TTCATTTCTA TTAGTGGCAA 900 

CTCTCCGTCC AGTACTGATT TTAAAGATTC AAAAGTTATC GCGTTTGATA TACGAGACGT 960 

TTTCGTTAAT GACAGCAATC TCCAATACAT GAGTGTTTTA TCTCTTAAGT CAGGATTATT 1020 

TTCGTGATCG GTGCATCCTT TTAATAAATC CATACAAAGT TCTTCAGTTT CCTTTGTAGG 1080 

ATTTCTGATG AAGAATTTTA TTGCTGAGTT CAGAATGGAA AATTGCACTT CTAGCGTCTC 1140 

ATTAAACATG TTTGAGGAAA AAACTCTAAA TAACTCCAGG TAGTTTGGAA TTACATCCGA 1200 

ATATTGCGTT ATTATCGAGA TCATAGCGTT TTTTGATTCA GGTTCCTGTA CAACTTCAGT 1260 

GTGTTTGACT AGTTCTGTTA CGTTTGCTTT AAAATTATTG GGATATTTCC TCAAAATATT 1320 

TCTGAAAACC GAAATAATCT CCTGGACGAC ATAATCAACA COGAATTCTA ACAAATCTAG 1380 

TAGCACAGCG ACACAATCGT GTACAGAGTC TTCATCTAGC TTAACAGCGA GATTACCAAT 1440 

GGCTCTGACT GATTTCCTTG ACATTTGAAT ATCAATATCT GTAGCATATT GTTCCAACTC 1500 

TTCTAGAATT CTTGGTAATG TTTCCTTGTT AGCTAAAAGA TATAAACACT CTAATTTCGT 1560 

GTCTTTGATG TATATGGGGT CATTGTACTC GATGAAAAAA TACGAAATGT CTAGCCTGAG 1620 

TAGAGATGAC TCCCTACTCA ATAAAAGAAG AATAACGTTT CTTAATACTA AAAATTGTAA 1680 

TTCAGGCGGC TTATCTAACA AAGCTATTAC AGAGTTAGAT AGCTTTTCGG CTAGAGTTTC 1740 

TTTGATGACG TCAACATAAT TCAACAAGTA CATGATGAAT TTTAAAGAGT TCAACACTAC 1800 



GTATGTGTTT ACTTGTTGCA GGTACGGTAA AGCTAGTTCG ATCATTTCAT GGGTATCCAA 1860 

ATAATGCTGC GGCACAACCG AAGTCGTCAA AACTTCCAAA ACAGTAGCCT TATTCCACTC 1920 

ATTTAATTCG GGTAAAAGTT CTAGCATGTC AAAAGCGAGT TCCAAGGGAA TCCTGAAGGT 1980 

TCCATGTTAG CGTTTTTTTC GTGAATGGAA TATAAAGTAT GTAATGCAGC TACAATGACT 2040 

TCTGGAGAGC TCGACTGTGC CTTTACAATG TCATGTAGAA TGCTTGATAA CCCCAATACC 2100 

CTTTCATGAT CAATTTCATC TAAATCCAAC AGTGCGTAAA TTGCTGTCCT CGTCACTTGT 2160 

TCAGGTGGAG ACTTGTGATT TACCAATGAA ATGATACAGT CGAAGGCCTG ATCAGATAGC 2220 

TCTTTCACOG GGACTAATAC CAGAGTTCTT AGTGCCATTA TTTGTAACTT TTCATCTCTG 2280 

CTTTTGAAAT CGTCCATTAT AAATGGCAAA GCCTCTCTGG CCTGCTGAGG TTTTAATGCG 2340 

CCGATCACCC TAATATACTC ATGGCAAATT CTTTTCACTT CTAGATCATC TTCAATTTGC 2400 

CAAAATTTCA AGAGCTCAGA AAACAGAAGG GACATTTCGC CATAGTTTCC TAGAACCAAA 2460 

TTGGCGATAA TTTTTCTCAG AGCATTTTTC CTTCTTGTTA TATTCGATTT AAACTTTTTT 2520 

ACTCCAAAAT GTTGCAGATC TGTGACGATT TCATTTGCTT TATATCTGGC AAAAACTTTT 2580 

TGATCGGACA TAAGCGAAAT ACGTCCTATT AATGAAGTGA ATGTTCTTGC TGTATTCCCT 2640 

TCTTGTGCAG TAGATTAATT CTGTTTCCAG GCTGCGATAC TTTGATACCC AATACTAAAA 2700 

GTTGATGATT TGAACGATCT CCTATTTCCT CGCACATTTT TGGAGCGATA CCCGGAAGAC 2760 

AGAATCGCGA TGTTAAGAAA ATAGTTCTGA TGGCACTAAA GAGATCATGA TTAAGGAAAG 2820 

GTAAGTGATA TGCATGAATG GGAATAGGCT TTCGAACTTG ACGATTTAGT TCCTTATTTC 2880 

TATCCATCTA ATCCTCCAAC TTCAATAGGC CTTATCTAGC TCAGAGCAGT ATTTAATTGA 2940 

GAATAGTAGC TTAATTGAAA CCTTACTAAA AAAGTGTATG GTTACATAAG ATAAGGOGTT 3000 

AAGAAGAGTA TACATATGCA TTATTCATTA CCAAGACCAC TATGAATAGT AATACCATAT 3060 

TTAGCTTTTG AAACTCATGT TTTCTATTGT GTTGTTTCAA ATTCCTCTGT TAGGCTCAAT 3120 

TTAGGTTAAT TAAATTATAA AAAAATATAA AAAATAAAGA AAGTTTATCC ATCGGCACCT 3180 

CAATTCAATG GAGTAAACAG TTTCAACACT GAGTGGTGAA ACATTGAACA ACTACATGCA 3240 

GTTTCCCGCC ACGAGGCAAG TGTAGGTCCT TTGTCCATTT CGCTTTGTTT TGCAGGTCAT 3300 

TGATGACCTA ATTAGGAAGG TAGAAGCCGC TCCAGCTCAA TAAGGAAATG CTAAGGGTAC 3360 

TCGCCTTTGG TGTTTTACCA TACAATGGCA GCTTTATGTC ACTTCATTCT TCAGTAACGG 3420 

CGCTTAAATA TTCCCAAAAA CGTTACAATG GAATTGTTTG ATCATGTAAC GAAATGCAAT 3480 

CTTCTAAAAA AAAAGCCATG TGAATCAAAA AAAGATTCCT TTTAGCATAC TATAAATATG 3540 

CAAAATGCCC TCTATTTATT CTAGTAATCG TCCATTCTCA TATCTTCCTT ATATCAGTCG 3600 

CCTCGCTTAA TATAGTCAGC ACAAAAGGAA CAACAATTCG CCAGTTTTCA AA ATG 3655 
Met 
1 

TTC ACT TTT CTC AAA ATT ATT CTG TGG CTT TTT TCC TTG GCA TTG GCC 3703 
Phe Thr Phe Leu Lys Ile Ile Leu Trp Leu Phe Ser Leu Ala Leu Ala 
5 10 15 

TCT GCT ATA AAT ATC AAC GAT ATC ACA TTT TCC AAT TTA GAA ATT ACT 3751 
Ser Ala Ile Asn Ile Asn Asp Ile Thr Phe Ser Asn Leu Glu Ile Thr 
20 25 30 

CCA CTG ACT GCA AAT AAA CAA CCT GAT CAA GGT TGG ACT GCC ACT TTT 3799 
Pro Leu Thr Ala Asn Lys Gln Pro Asp Gln Gly Trp Thr Ala Thr Phe 
35 40 45 

GAT TTT AGT ATT GCA GAT GCG TCT TCC ATT AGG GAG GGC GAT GAA TTC 3847 
Asp Phe Ser Ile Ala Asp Ala Ser Ser Ile Arg Glu Gly Asp Glu Phe 
50 55 60 65 

ACA TTA TCA ATG CCA CAT GTT TAT AGG ATT AAG CTA TTA AAC TCA TCG 3895 
Thr Leu Ser Met Pro His Val Tyr Arg Ile Lys Leu Leu Asn Ser Ser 
70 75 80 

CAA ACA GCT ACT ATT TCC TTA GCG GAT GGT ACT GAG GCT TTC AAA TGC 3943 
Gln Thr Ala Thr Ile Ser Leu Ala Asp Gly Thr Glu Ala Phe Lys Cys 
85 90 95 

TAT GTT TCG CAA GAG GCT GCA TAC TTG TAT GAA AAT ACT ACT TTC ACA 3991 
Tyr Val Ser Gln Gln Ala Ala Tyr Leu Tyr Glu Asn Thr Thr Phe Thr 
100 105 110 

TGT ACT GCT CAA AAT GAC CTG TCC TCC TAT AAT ACG ATT GAT GGA TCC 4039 
Cys Thr Ala Gln Asn Asp Leu Ser Ser Tyr Asn Thr Ile Asp Gly Ser 
115 120 125 

ATA ACA TTT TCG CTA AAT TTT AGT GAT GGT GGT TCC AGC TAT GAA TAT 4087 
Ile Thr Phe Ser Leu Asn Phe Ser Asp Gly Gly Ser Ser Tyr Glu Tyr 
130 135 140 145 

GAG TTA GAA AAC GCT AAG TTT TTC AAA TCT GGG CCA ATG CTT GTT AAA 4135 
Glu Leu Glu Asn Ala Lys Phe Phe Lys Ser Gly Pro Met Leu Val Lys 
150 155 160 

CTT GGT AAT CAA ATG TCA GAT GTG GTG AAT TTC GAT CCT GCT GCT TTT 4183 
Leu Gly Asn Gln Met Ser Asp Val Val Asn Phe Asp Pro Ala Ala Phe 
165 170 175 

ACA GAG AAT GTT TTT CAC TCT GGG CGT TCA ACT GGT TAC GGT TCT TTT 4231 
Thr Glu Asn Val Phe His Ser Gly Arg Ser Thr Gly Tyr Gly Ser Phe 
180 185 190 

GAA AGT TAT CAT TTG GGT ATG TAT TGT CCA AAC GGA TAT TTC CTG GGT 4279 
Glu Ser Tyr His Leu Gly Met Tyr Cys Pro Asn Gly Tyr Phe Leu Gly 
195 200 205 

GGT ACT GAG AAG ATT GAT TAC GAC AGT TCC AAT AAC AAT GTC GAT TTG 4327 
Gly Thr Glu Lys Ile Asp Tyr Asp Ser Ser Asn Asn Asn Val Asp Leu 
210 215 220 225 

GAT TGT TCT TCA GTT CAG GTT TAT TCA TCC AAT GAT TTT AAT GAT TGG 4375 
Asp Cys Ser Ser Val Gln Val Tyr Ser Ser Asn Asp Phe Asn Asp Trp 
230 235 240 

TGG TTC CCG CAA AGT TAC AAT GAT ACC AAT GCT GAC GTC ACT TGT TTT 4423 
Trp Phe Pro Gln Ser Tyr Asn Asp Thr Asn Ala Asp Val Thr Cys Phe 
245 250 255 

GGT AGT AAT CTG TGG ATT ACA CTT GAC GAA AAA CTA TAT GAT GGG GAA 4471 
Gly Ser Asn Leu Trp Ile Thr Leu Asp Glu Lys Leu Tyr Asp Gly Glu 
260 265 270 

ATG TTA TGG GTT AAT GCA TTA CAA TCT CTA CCC GCT AAT GTA AAC ACA 4519 
Met Leu Trp Val Asn Ala Leu Gln Ser Leu Pro Ala Asn Val Asn Thr 
275 280 285 

ATA GAT CAT GCG TTA GAA TTT CAA TAC ACA TGC CTT GAT ACC ATA GCA 4567 
Ile Asp His Ala Leu Glu Phe Gln Tyr Thr Cys Leu Asp Thr Ile Ala 
290 295 300 305 

AAT ACT ACG TAC GCT ACG CAA TTC TCG ACT ACT AGG GAA TTT ATT GTT 4615 
Asn Thr Thr Tyr Ala Thr Gln Phe Ser Thr Thr Arg Glu Phe Ile Val 
310 315 320 

TAT CAG GGT CGG AAC CTC GGT ACA GCT AGC GCC AAA AGC TCT TTT ATC 4663 
Tyr Gln Gly Arg Asn Leu Gly Thr Ala Ser Ala Lys Ser Ser Phe Ile 
325 330 335 

TCA ACC ACT ACT ACT GAT TTA ACA ACT ATA AAC ACT AGT GCG TAT TCC 4711 
Ser Thr Thr Thr Thr Asp Leu Thr Ser Ile Asn Thr Ser Ala Tyr Ser 
340 345 350 

ACT GGA TCC ATT TCC ACA GTA GAA ACA GGC AAT CGA ACT ACA TCA GAA 4759 
Thr Gly Ser Ile Ser Thr Val Glu Thr Gly Asn Arg Thr Thr Ser Glu 
355 360 365 

GTG ATC AGT CAT GTG GTG ACT ACC AGC ACA AAA CTG TCT CCA ACT GCT 4807 
Val Ile Ser His Val Val Thr Thr Ser Thr Lys Leu Ser Pro Thr Ala 
370 375 380 385 

ACT ACC AGC CTG ACA ATT GCA CAA ACC AGT ATC TAT TCT ACT GAC TCA 4855 
Thr Thr Ser Leu Thr Ile Ala Gln Thr Ser Ile Tyr Ser Thr Asp Ser 
390 395 400 

AAT ATC ACA GTA GGA ACA GAT ATT CAC ACC ACA TCA GAA GTG ATT AGT 4903 
Asn Ile Thr Val Gly Thr Asp Ile His Thr Thr Ser Glu Val Ile Ser 
405 410 415 

GAT GTG GAA ACC ATT AGC AGA GAA ACA GCT TCG ACC GTT GTA GCC GCT 4951 
Asp Val Glu Thr Ile Ser Arg Glu Thr Ala Ser Thr Val Val Ala Ala 
420 425 430 

CCA ACC TCA ACA ACT GGA TGG ACA GGC GCT ATG AAT ACT TAC ATC CCG 4999 
Pro Thr Ser Thr Thr Gly Trp Thr Gly Ala Met Asn Thr Tyr Ile Pro 
435 440 445 

CAA TTT ACA TCC TCT TCT TTC GCA ACA ATC AAC AGC ACA CCA ATA ATC 5047 
Gln Phe Thr Ser Ser Ser Phe Ala Thr Ile Asn Ser Thr Pro Ile Ile 
450 455 460 465 

TCT TCA TCA GCA GTA TTT GAA ACC TCA GAT GCT TCA ATT GTC AAT GTG 5095 
Ser Ser Ser Ala Val Phe Glu Thr Ser Asp Ala Ser Ile Val Asn Val 
470 475 480 

CAC ACT GAA AAT ATC ACG AAT ACT GCT GCT GTT CCA TCT GAA GAG CCC 5143 
His Thr Glu Asn Ile Thr Asn Thr Ala Ala Val Pro Ser Glu Glu Pro 
485 490 495 

ACT TTT GTA AAT GCC ACG AGA AAC TCC TTA AAT TCC TTC TGC AGC AGC 5191 
Thr Phe Val Asn Ala Thr Arg Asn Ser Leu Asn Ser Phe Cys Ser Ser 
500 505 510 

AAA CAG CCA TCC AGT CCC TCA TCT TAT ACG TCT TCC CCA CTC GTA TCG 5239 
Lys Gln Pro Ser Ser Pro Ser Ser Tyr Thr Ser Ser Pro Leu Val Ser 
515 520 525 

TCC CTC TCC GTA AGC AAA ACA TTA CTA AGC ACC AGT TTT ACG CCT TCT 5287 
Ser Leu Ser Val Ser Lys Thr Leu Leu Ser Thr Ser Phe Thr Pro Ser 
530 535 540 545 

GTG CCA ACA TCT AAT ACA TAT ATC AAA ACG GAA AAT ACG GGT TAC TTT 5335 
Val Pro Thr Ser Asn Thr Tyr Ile Lys Thr Glu Asn Thr Gly Tyr Phe 
550 555 560 

GAG CAC ACG GCT TTG ACA ACA TCT TCA GTT GGC CTT AAT TCT TTT AGT 5383 
Glu His Thr Ala Leu Thr Thr Ser Ser Val Gly Leu Asn Ser Phe Ser 
565 570 575 

GAA ACA GCA CTC TCA TCT CAG GGA ACG AAA ATT GAC ACC TTT TTA GTG 5431 
Glu Thr Ala Leu Ser Ser Gln Gly Thr Lys Ile Asp Thr Phe Leu Val 
580 585 590 

TCA TCC TTG ATC GCA TAT CCT TCT TCT GCA TCA GGA AGC CAA TTG TCC 5479 
Ser Ser Leu Ile Ala Tyr Pro Ser Ser Ala Ser Gly Ser Gln Leu Ser 
595 600 605 

GGT ATC CAA CAG AAT TTC ACA TCA ACT TCT CTC ATG ATT TCA ACC TAT 5527 
Gly Ile Gln Gln Asn Phe Thr Ser Thr Ser Leu Met Ile Ser Thr Tyr 
610 615 620 625 

GAA GGT AAA GCG TCT ATA TTT TTC TCA GCT GAG CTC GGT TCG ATC ATT 5575 
Glu Gly Lys Ala Ser Ile Phe Phe Ser Ala Glu Leu Gly Ser Ile Ile 
630 635 640 

TTT CTG CTT TTG TCG TAC CTG CTA TTC TAAAACGGGT ACTGTACAGT 5622 
Phe Leu Leu Leu Ser Tyr Leu Leu Phe 
645 650 

TAGTACATTG AGTCGAAATA TACGAAATTA TTGTTCATAA TTTTCATCCT GGCTCTTTTT 5682 

TTCTTCAACC ATAGTTAAAT GGACAGTTCA TATCTTAAAC TCTAATAATA CTTTTCTAGT 5742 

TCTTATCCTT TTCCGTCTCA CCGCAGATTT TATCATAGTA TTAAATTTAT ATTTTGTTCG 5802 

TAAAAAGAAA AATTTGTGAG CGTTACCGCT CGTTTCATTA CCCGAAGGCT GTTTCAGTAG 5862 

ACCACTGATT AAGTAAGTAG ATGAAAAAAT TTCATCACCA TGAAAGAGTT CGATGAGAGC 5922 

TACTTTTTCA AATGCTTAAC AGCTAACCGC CATTCAATAA TGTTACGTTC TCTTCATTCT 5982 

GCGGCTACGT TATCTAACAA GAGGTTTTAC TCTCTCATAT CTCATTCAAA TAGAAAGAAC 6042 

ATAATCAAAA AG CTT 6057 
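
The number at the end of each nucleotide line is a running base count, so a scanned copy of the listing can be checked mechanically. The following is only a minimal sketch: the helper name `check_running_count` is hypothetical (it is not part of PatentIn or of this application), and it assumes each line consists of base groups followed by one trailing integer. The three sample lines are the final lines of SEQ ID NO: 1 as printed above.

```python
def check_running_count(lines, start=0):
    """Compare the stated running total on each line with a recount of its bases.

    Returns a list of (line_index, recounted_total, stated_total) for every
    line where the two disagree. `start` is the total stated on the line
    that precedes the first line passed in.
    """
    problems = []
    count = start
    for index, line in enumerate(lines):
        *groups, stated = line.split()
        count += sum(len(group) for group in groups)
        if int(stated) != count:
            problems.append((index, count, int(stated)))
    return problems


# Last three nucleotide lines of SEQ ID NO: 1; the preceding line ends at 5922.
tail = [
    "TACTTTTTCA AATGCTTAAC AGCTAACCGC CATTCAATAA TGTTACGTTC TCTTCATTCT 5982",
    "GCGGCTACGT TATCTAACAA GAGGTTTTAC TCTCTCATAT CTCATTCAAA TAGAAAGAAC 6042",
    "ATAATCAAAA AG CTT 6057",
]
print(check_running_count(tail, start=5922))
```

An empty result means the stated totals agree with the recount. A non-empty result only flags a line for manual comparison against the original document; it cannot tell whether the bases or the printed number were misread.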

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 650 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Phe Thr Phe Leu Lys Ile Ile Leu Trp Leu Phe Ser Leu Ala Leu 
1 5 10 15 

Ala Ser Ala Ile Asn Ile Asn Asp Ile Thr Phe Ser Asn Leu Glu Ile 
20 25 30 

Thr Pro Leu Thr Ala Asn Lys Gln Pro Asp Gln Gly Trp Thr Ala Thr 
35 40 45 

Phe Asp Phe Ser Ile Ala Asp Ala Ser Ser Ile Arg Glu Gly Asp Glu 
50 55 60 

Phe Thr Leu Ser Met Pro His Val Tyr Arg Ile Lys Leu Leu Asn Ser 
65 70 75 80 

Ser Gln Thr Ala Thr Ile Ser Leu Ala Asp Gly Thr Glu Ala Phe Lys 
85 90 95 

Cys Tyr Val Ser Gln Gln Ala Ala Tyr Leu Tyr Glu Asn Thr Thr Phe 
100 105 110 

Thr Cys Thr Ala Gln Asn Asp Leu Ser Ser Tyr Asn Thr Ile Asp Gly 
115 120 125 

Ser Ile Thr Phe Ser Leu Asn Phe Ser Asp Gly Gly Ser Ser Tyr Glu 
130 135 140 

Tyr Glu Leu Glu Asn Ala Lys Phe Phe Lys Ser Gly Pro Met Leu Val 
145 150 155 160 

Lys Leu Gly Asn Gln Met Ser Asp Val Val Asn Phe Asp Pro Ala Ala 
165 170 175 

Phe Thr Glu Asn Val Phe His Ser Gly Arg Ser Thr Gly Tyr Gly Ser 
180 185 190 

Phe Glu Ser Tyr His Leu Gly Met Tyr Cys Pro Asn Gly Tyr Phe Leu 
195 200 205 

Gly Gly Thr Glu Lys Ile Asp Tyr Asp Ser Ser Asn Asn Asn Val Asp 
210 215 220 

Leu Asp Cys Ser Ser Val Gln Val Tyr Ser Ser Asn Asp Phe Asn Asp 
225 230 235 240 

Trp Trp Phe Pro Gln Ser Tyr Asn Asp Thr Asn Ala Asp Val Thr Cys 
245 250 255 

Phe Gly Ser Asn Leu Trp Ile Thr Leu Asp Glu Lys Leu Tyr Asp Gly 
260 265 270 

Glu Met Leu Trp Val Asn Ala Leu Gln Ser Leu Pro Ala Asn Val Asn 
275 280 285 

Thr Ile Asp His Ala Leu Glu Phe Gln Tyr Thr Cys Leu Asp Thr Ile 
290 295 300 

Ala Asn Thr Thr Tyr Ala Thr Gln Phe Ser Thr Thr Arg Glu Phe Ile 
305 310 315 320 

Val Tyr Gln Gly Arg Asn Leu Gly Thr Ala Ser Ala Lys Ser Ser Phe 
325 330 335 

Ile Ser Thr Thr Thr Thr Asp Leu Thr Ser Ile Asn Thr Ser Ala Tyr 
340 345 350 

Ser Thr Gly Ser Ile Ser Thr Val Glu Thr Gly Asn Arg Thr Thr Ser 
355 360 365 

Glu Val Ile Ser His Val Val Thr Thr Ser Thr Lys Leu Ser Pro Thr 
370 375 380 

Ala Thr Thr Ser Leu Thr Ile Ala Gln Thr Ser Ile Tyr Ser Thr Asp 
385 390 395 400 

Ser Asn Ile Thr Val Gly Thr Asp Ile His Thr Thr Ser Glu Val Ile 
405 410 415 

Ser Asp Val Glu Thr Ile Ser Arg Glu Thr Ala Ser Thr Val Val Ala 
420 425 430 

Ala Pro Thr Ser Thr Thr Gly Trp Thr Gly Ala Met Asn Thr Tyr Ile 
435 440 445 

Pro Gln Phe Thr Ser Ser Ser Phe Ala Thr Ile Asn Ser Thr Pro Ile 
450 455 460 

Ile Ser Ser Ser Ala Val Phe Glu Thr Ser Asp Ala Ser Ile Val Asn 
465 470 475 480 

Val His Thr Glu Asn Ile Thr Asn Thr Ala Ala Val Pro Ser Glu Glu 
485 490 495 

Pro Thr Phe Val Asn Ala Thr Arg Asn Ser Leu Asn Ser Phe Cys Ser 
500 505 510 

Ser Lys Gln Pro Ser Ser Pro Ser Ser Tyr Thr Ser Ser Pro Leu Val 
515 520 525 

Ser Ser Leu Ser Val Ser Lys Thr Leu Leu Ser Thr Ser Phe Thr Pro 
530 535 540 

Ser Val Pro Thr Ser Asn Thr Tyr Ile Lys Thr Glu Asn Thr Gly Tyr 
545 550 555 560 

Phe Glu His Thr Ala Leu Thr Thr Ser Ser Val Gly Leu Asn Ser Phe 
565 570 575 

Ser Glu Thr Ala Leu Ser Ser Gln Gly Thr Lys Ile Asp Thr Phe Leu 
580 585 590 

Val Ser Ser Leu Ile Ala Tyr Pro Ser Ser Ala Ser Gly Ser Gln Leu 
595 600 605 

Ser Gly Ile Gln Gln Asn Phe Thr Ser Thr Ser Leu Met Ile Ser Thr 
610 615 620 

Tyr Glu Gly Lys Ala Ser Ile Phe Phe Ser Ala Glu Leu Gly Ser Ile 
625 630 635 640 

Ile Phe Leu Leu Leu Ser Tyr Leu Leu Phe 
645 650 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipol 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



GGGGCGGCCG AGGTCTCGCA AGATCTGGA 29 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part non-coding strand lipase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TTTGTCCAGG TCTTGCGAGA CCTCTCGACG AAT 33 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part coding strand lipase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TTCGGGTTAA TTGGGACATG TCTTTAGTGC GA 32 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

CCCCAAGCTT AAGGCTAGCA AGACATGTCC CAATTAACCC 40 



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 894 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Humicola lanuginosa 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 72..884 

(D) OTHER INFORMATION: /product= "lipase" 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 72..881 

(D) OTHER INFORMATION: /product= "lipase" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GAATTCGTAG CGACGATATG AGGAGCTCCC TTGTGCTGTT CTTTGTCTCT GCGTGGACGG 60 

CCTTGGCCAC G GCC GAG GTC TCG CAA GAT CTG TTT AAC CAG TTC AAT CTC 110 
Ala Glu Val Ser Gln Asp Leu Phe Asn Gln Phe Asn Leu 
1 5 10 

TTT GCA CAG TAT TCT GCT GCC GCA TAC TGC GGA AAA AAC AAT GAT GCC 158 
Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn Asp Ala 
15 20 25 

CCA GCT GGT ACA AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC GAG GTA 206 
Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro Glu Val 
30 35 40 45 

GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT GGA GTG 254 
Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser Gly Val 
50 55 60 

GGC GAT GTC ACC GGC TTC CTT GCT CTA GAC AAC ACG AAC AAA TTG ATC 302 
Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys Leu Ile 
65 70 75 

GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAA AAC TGG ATC GGA AAT 350 
Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile Gly Asn 
80 85 90 

CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC TGC AGG 398 
Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly Cys Arg 
95 100 105 

GGA CAT GAC GGC TTC ACC TCG AGC TGG AGG TCT GTA GCC GAT ACG TTA 446 
Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp Thr Leu 
110 115 120 125 

AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC TAT CGC GTG 494 
Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr Arg Val 
130 135 140 

GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA ACT GTT GCC GGA 542 
Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val Ala Gly 
145 150 155 

GCA GAC CTG CGT GGA AAT GGG TAT GAC ATC GAC GTG TTT TCA TAT GGC 590 
Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser Tyr Gly 
160 165 170 

GCC CCC CGA GTC GGA AAC AGG GCT TTT GCA GAA TTC CTG ACC GTA CAG 638 
Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr Val Gln 
175 180 185 

ACC GGC GGT ACC CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT GTC CCT 686 
Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile Val Pro 
190 195 200 205 

AGA CTC CCG CCG CGC GAG TTC GGT TAC AGC CAT TCT AGC CCA GAG TAC 734 
Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro Glu Tyr 
210 215 220 

TGG ATC AAA TCT GGA ACC CTT GTC CCC GTC ACC CGA AAC GAC ATC GTG 782 
Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp Ile Val 
225 230 235 

AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT AAC ATT 830 
Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro Asn Ile 
240 245 250 

CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG ACA TGT 878 
Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly Thr Cys 
255 260 265 

CTT TAGTGCGAAG CTT 894 
Leu 
270 
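
The residue line printed under each codon line can be re-derived from the codons themselves. The sketch below is an illustration only and is not part of the application: it assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical. It translates the first thirteen codons of the coding region of SEQ ID NO: 7 (positions 72 to 110).

```python
# Standard genetic code, built from the usual compact TCAG-ordered string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter residue names as used in the listing ("***" = stop).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna):
    """Translate a DNA string (spaces ignored) into three-letter residue names."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return [THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)]


first_codons = "GCC GAG GTC TCG CAA GAT CTG TTT AAC CAG TTC AAT CTC"
print(" ".join(translate(first_codons)))
# prints: Ala Glu Val Ser Gln Asp Leu Phe Asn Gln Phe Asn Leu
```

Comparing this output line by line with the printed residue line is one way to spot places where a scanned codon and its residue disagree; whether the codon or the residue is the misread one still has to be decided from the original document.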



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Ala Glu Val Ser Gln Asp Leu Phe Asn Gln Phe Asn Leu Phe Ala Gln 
1 5 10 15 

Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn Asp Ala Pro Ala Gly 
20 25 30 

Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro Glu Val Glu Lys Ala 
35 40 45 

Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser Gly Val Gly Asp Val 
50 55 60 

Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys Leu Ile Val Leu Ser 
65 70 75 80 

Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile Gly Asn Leu Asn Phe 
85 90 95 

Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly Cys Arg Gly His Asp 
100 105 110 

Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp Thr Leu Arg Gln Lys 
115 120 125 

Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr Arg Val Val Phe Thr 
130 135 140 

Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val Ala Gly Ala Asp Leu 
145 150 155 160 

Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser Tyr Gly Ala Pro Arg 
165 170 175 

Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr Val Gln Thr Gly Gly 
180 185 190 

Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile Val Pro Arg Leu Pro 
195 200 205 

Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro Glu Tyr Trp Ile Lys 
210 215 220 

Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp Ile Val Lys Ile Glu 
225 230 235 240 

Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro Asn Ile Pro Asp Ile 
245 250 255 

Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly Thr Cys Leu 
260 265 270 



WO 94/01567 PCT/EP93/01763 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATCCCTGCGC ACCTATGGTA CTTCGGGTTA ATTGGGACAT GTCTTGCTAG CCTTA 55 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AGCTTAAGGC TAGCAAGACA TGTCCCAATT AACCCGAAGT ACCATAGGTG CGCAGGGAT 59 
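
SEQ ID NO: 9 and SEQ ID NO: 10 read as two strands of one synthetic duplex. A short reverse-complement check makes that relationship explicit. This is a sketch only, with a hypothetical helper name (`reverse_complement`), and it assumes both sequences are written 5' to 3' as is usual in a sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string, ignoring spaces."""
    return "".join(seq.split()).upper().translate(COMPLEMENT)[::-1]


SEQ_ID_9 = "ATCCCTGCGC ACCTATGGTA CTTCGGGTTA ATTGGGACAT GTCTTGCTAG CCTTA"
SEQ_ID_10 = "AGCTTAAGGC TAGCAAGACA TGTCCCAATT AACCCGAAGT ACCATAGGTG CGCAGGGAT"

top = "".join(SEQ_ID_9.split())
bottom_rc = reverse_complement(SEQ_ID_10)

# The 55-mer pairs with the 59-mer over its full length; the four extra bases
# of the 59-mer remain single-stranded at its 5' end.
print(bottom_rc.startswith(top))               # True
print("".join(SEQ_ID_10.split())[: len(bottom_rc) - len(top)])  # AGCT
```

Under these assumptions the annealed pair is blunt at one end and carries a four-base 5' AGCT extension at the other, which would be compatible with a HindIII-cut end if that is how the linker was used.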



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1828 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA to mRNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Geotrichum candidum 
(B) STRAIN: CMICC 335426 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 40..1731 

(D) OTHER INFORMATION: /product= "lipase" 

(ix) FEATURE: 

(A) NAME/KEY: sig_peptide 

(B) LOCATION: 40..96 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 97..1728 

(D) OTHER INFORMATION: /product= "lipase" 
/gene= "lipB" 
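
The three feature locations given for SEQ ID NO: 11 can be cross-checked against each other and against the 563 amino acids stated for SEQ ID NO: 12. The sketch below is only an illustration: the helper `codons_in` is hypothetical, and it assumes the locations are 1-based and inclusive and that the CDS span includes the stop codon.

```python
def codons_in(start, end):
    """Number of whole codons in a 1-based, inclusive nucleotide span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"span {start}..{end} is not a multiple of three")
    return length // 3


CDS = (40, 1731)          # coding sequence, including the stop codon
SIG_PEPTIDE = (40, 96)    # signal peptide
MAT_PEPTIDE = (97, 1728)  # mature lipase

signal_residues = codons_in(*SIG_PEPTIDE)   # 19, numbered -19 to -1
mature_residues = codons_in(*MAT_PEPTIDE)   # 544
cds_codons = codons_in(*CDS)                # 564 = 563 residues + 1 stop

print(signal_residues, mature_residues, cds_codons)
print(signal_residues + mature_residues == cds_codons - 1)
```

The signal and mature peptides together account for 563 residues, one fewer than the 564 codons of the CDS, which is consistent with the length given for SEQ ID NO: 12.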

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

AATTCGGCAC GAGATTCCTT TGATTTGCAA CTGTTAATC ATG GTT TCC AAA AGC 54 
Met Val Ser Lys Ser 
-19 -15 

TTT TTT TTG GCT GCG GCG CTC AAC GTA GTG GGC ACC TTG GCC CAG GCC 102 
Phe Phe Leu Ala Ala Ala Leu Asn Val Val Gly Thr Leu Ala Gln Ala 
-10 -5 1 

CCC ACG GCC GTT CTT AAT GGC AAC GAG GTC ATC TCT GGT GTC CTT GAG 150 
Pro Thr Ala Val Leu Asn Gly Asn Glu Val Ile Ser Gly Val Leu Glu 
5 10 15 

GGC AAG GTT GAT ACC TTC AAG GGA ATC CCA TTT GCT GAC CCT CCT GTT 198 
Gly Lys Val Asp Thr Phe Lys Gly Ile Pro Phe Ala Asp Pro Pro Val 
20 25 30 

GGT GAC TTG CGG TTC AAG CAC CCC CAG CCT TTC ACT GGA TCC TAC CAG 246 
Gly Asp Leu Arg Phe Lys His Pro Gln Pro Phe Thr Gly Ser Tyr Gln 
35 40 45 50 

GGT CTT AAG GCC AAC GAC TTC AGC TCT GCT TGT ATG CAG CTT GAT CCT 294 
Gly Leu Lys Ala Asn Asp Phe Ser Ser Ala Cys Met Gln Leu Asp Pro 
55 60 65 

GGC AAT GCC TTT TCT TTG CTT GAC AAA GTA GTG GGC TTG GGA AAG ATT 342 
Gly Asn Ala Phe Ser Leu Leu Asp Lys Val Val Gly Leu Gly Lys Ile 
70 75 80 



CTT CCT GAT AAC CTT AGA GGC CCT CTT TAT GAC ATG GCC CAG GGT AGT 390 
Leu Pro Asp Asn Leu Arg Gly Pro Leu Tyr Asp Met Ala Gln Gly Ser 
85 90 95 

GTC TCC ATG AAT GAG GAC TGT CTC TAC CTT AAC GTT TTC CGC CCC GCT 438 
Val Ser Met Asn Glu Asp Cys Leu Tyr Leu Asn Val Phe Arg Pro Ala 
100 105 110 

GGC ACC AAG CCT GAT GCT AAG CTC CCC GTC ATG GTT TGG ATT TAC GGT 486 
Gly Thr Lys Pro Asp Ala Lys Leu Pro Val Met Val Trp Ile Tyr Gly 
115 120 125 130 

GGT GCC TTT GTG TTT GGT TCT TCT GCT TCT TAC CCT GGT AAC GGC TAC 534 
Gly Ala Phe Val Phe Gly Ser Ser Ala Ser Tyr Pro Gly Asn Gly Tyr 
135 140 145 

GTC AAG GAG AGT GTG GAA ATG GGC CAG CCT GTT GTG TTT GTT TCC ATC 582 
Val Lys Glu Ser Val Glu Met Gly Gln Pro Val Val Phe Val Ser Ile 
150 155 160 



AAC TAC CGT ACC GGC CCC TAT GGA TTC TTG GGT GGT GAT GCC ATC ACC 630 
Asn Tyr Arg Thr Gly Pro Tyr Gly Phe Leu Gly Gly Asp Ala Ile Thr 
165 170 175 

GCT GAG GGC AAC ACC AAC GCT GGT CTG CAC GAC CAG CGC AAG GGT CTC 678 
Ala Glu Gly Asn Thr Asn Ala Gly Leu His Asp Gln Arg Lys Gly Leu 
180 185 190 

GAG TGG GTT AGC GAC AAC ATT GCC AAC TTT GGT GGT GAT CCC GAC AAG 726 
Glu Trp Val Ser Asp Asn Ile Ala Asn Phe Gly Gly Asp Pro Asp Lys 
195 200 205 210 

GTC ATG ATT TTC GGT GAG TCC GCT GGT GCC ATG AGT GTT GCT CAC CAG 774 
Val Met Ile Phe Gly Glu Ser Ala Gly Ala Met Ser Val Ala His Gln 
215 220 225 

CTT GTT GCC TAC GGT GGT GAC AAC ACC TAC AAC GGA AAG CAG CTT TTC 822 
Leu Val Ala Tyr Gly Gly Asp Asn Thr Tyr Asn Gly Lys Gln Leu Phe 
230 235 240 

CAC TCT GCC ATT CTT CAG TCT GGC GGT CCT CTT CCT TAC TTT GAC TCT 870 
His Ser Ala Ile Leu Gln Ser Gly Gly Pro Leu Pro Tyr Phe Asp Ser 
245 250 255 

ACT TCT GTT GGT CCC GAG AGT GCC TAC AGC AGA TTT GCT CAG TAT GCC 918 
Thr Ser Val Gly Pro Glu Ser Ala Tyr Ser Arg Phe Ala Gln Tyr Ala 
260 265 270 

GGA TGT GAC ACC AGT GCC AGT GAT AAT GAC ACT CTG GCT TGT CTC CGC 966 
Gly Cys Asp Thr Ser Ala Ser Asp Asn Asp Thr Leu Ala Cys Leu Arg 
275 280 285 290 

AGC AAG TCC AGC GAT GTC TTG CAC AGT GCG CAG AAC TCG TAT GAT CTT 1014 
Ser Lys Ser Ser Asp Val Leu His Ser Ala Gln Asn Ser Tyr Asp Leu 
295 300 305 

AAG GAC CTG TTT GGT CTG CTC CCT CAA TTC CTT GGA TTT GGT CCC AGA 1062 
Lys Asp Leu Phe Gly Leu Leu Pro Gln Phe Leu Gly Phe Gly Pro Arg 
310 315 320 

CCC GAC GGC AAC ATT ATT CCC GAT GCC GCT TAT GAG CTC TAC CGC AGC 1110 
Pro Asp Gly Asn Ile Ile Pro Asp Ala Ala Tyr Glu Leu Tyr Arg Ser 
325 330 335 

GGT AGA TAC GCC AAG GTT CCC TAC ATT ACT GGC AAC CAG GAG GAT GAG 1158 
Gly Arg Tyr Ala Lys Val Pro Tyr Ile Thr Gly Asn Gln Glu Asp Glu 
340 345 350 

GGT ACT ATT CTT GCC CCC GTT GCT ATT AAT GCT ACC ACT ACT CCC CAT 1206 
Gly Thr Ile Leu Ala Pro Val Ala Ile Asn Ala Thr Thr Thr Pro His 
355 360 365 370 

GTT AAG AAG TGG TTG AAG TAC ATT TGT AGC CAG GCT TCT GAC GCT TCG 1254 
Val Lys Lys Trp Leu Lys Tyr Ile Cys Ser Gln Ala Ser Asp Ala Ser 
375 380 385 

CTT GAT CGT GTT TTG TCG CTC TAC CCC GGC TCT TGG TCG GAG GGT TCA 1302 
Leu Asp Arg Val Leu Ser Leu Tyr Pro Gly Ser Trp Ser Glu Gly Ser 
390 395 400 

CCA TTC CGC ACT GGT ATT CTT AAT GCT CTT ACC CCT CAG TTC AAG CGC 1350 
Pro Phe Arg Thr Gly Ile Leu Asn Ala Leu Thr Pro Gln Phe Lys Arg 
405 410 415 

ATT GCT GCC ATT TTC ACT GAT TTG CTG TTC CAG TCT CCT CGT CGT GTT 1398 
Ile Ala Ala Ile Phe Thr Asp Leu Leu Phe Gln Ser Pro Arg Arg Val 
420 425 430 

ATG CTT AAC GCT ACC AAG GAC GTC AAC CGC TGG ACT TAC CTT GCC ACC 1446 
Met Leu Asn Ala Thr Lys Asp Val Asn Arg Trp Thr Tyr Leu Ala Thr 
435 440 445 450 

CAG CTC CAT AAC CTC GTT CCA TTT TTG GGT ACT TTC CAT GGC AGT GAT 1494 
Gln Leu His Asn Leu Val Pro Phe Leu Gly Thr Phe His Gly Ser Asp 
455 460 465 

CTT CTT TTT CAA TAC TAC GTG GAC CTT GGC CCA TCT TCT GCT TAC CGC 1542 
Leu Leu Phe Gln Tyr Tyr Val Asp Leu Gly Pro Ser Ser Ala Tyr Arg 
470 475 480 

CGC TAC TTT ATC TCG TTT GCC AAC GAC CAC GAC CCC AAC GTT GGT ACC 1590 
Arg Tyr Phe Ile Ser Phe Ala Asn His His Asp Pro Asn Val Gly Thr 
485 490 495 

AAC CTC CAA CAG TGG GAT ATG TAC ACT GAT GCA GGC AAG GAG ATG CTT 1638 
Asn Leu Gln Gln Trp Asp Met Tyr Thr Asp Ala Gly Lys Glu Met Leu 
500 505 510 

CAG ATT CAT ATG ATT GGT AAC TCT ATG AGA ACT GAC GAC TTT AGA ATC 1686 
Gln Ile His Met Ile Gly Asn Ser Met Arg Thr Asp Asp Phe Arg Ile 
515 520 525 530 

GAG GGA ATC TCG AAC TTT GAG TCT GAC GTT ACT CTC TTC GGT TAATCCCATT 1738 
Glu Gly Ile Ser Asn Phe Glu Ser Asp Val Thr Leu Phe Gly 
535 540 545 

TAGCAAGTTT TGTGTATTTC AAGTATACCA GTTGATGTAA TATATCAATA GATTACAAAT 1798 

TAATTAGTGA AAAAAAAAAA AAAAAAAAAC 1828 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 563 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Val Ser Lys Ser Phe Phe Leu Ala Ala Ala Leu Asn Val Val Gly 
-19 -15 -10 -5 

Thr Leu Ala Gln Ala Pro Thr Ala Val Leu Asn Gly Asn Glu Val Ile 
1 5 10 

Ser Gly Val Leu Glu Gly Lys Val Asp Thr Phe Lys Gly Ile Pro Phe 
15 20 25 

Ala Asp Pro Pro Val Gly Asp Leu Arg Phe Lys His Pro Gln Pro Phe 
30 35 40 45 

Thr Gly Ser Tyr Gln Gly Leu Lys Ala Asn Asp Phe Ser Ser Ala Cys 
50 55 60 

Met Gln Leu Asp Pro Gly Asn Ala Phe Ser Leu Leu Asp Lys Val Val 
65 70 75 

Gly Leu Gly Lys Ile Leu Pro Asp Asn Leu Arg Gly Pro Leu Tyr Asp 
80 85 90 

Met Ala Gln Gly Ser Val Ser Met Asn Glu Asp Cys Leu Tyr Leu Asn 
95 100 105 

Val Phe Arg Pro Ala Gly Thr Lys Pro Asp Ala Lys Leu Pro Val Met 
110 115 120 125 

Val Trp Ile Tyr Gly Gly Ala Phe Val Phe Gly Ser Ser Ala Ser Tyr 
130 135 140 

Pro Gly Asn Gly Tyr Val Lys Glu Ser Val Glu Met Gly Gln Pro Val 
145 150 155 

Val Phe Val Ser Ile Asn Tyr Arg Thr Gly Pro Tyr Gly Phe Leu Gly 
160 165 170 

Gly Asp Ala Ile Thr Ala Glu Gly Asn Thr Asn Ala Gly Leu His Asp 
175 180 185 

Gln Arg Lys Gly Leu Glu Trp Val Ser Asp Asn Ile Ala Asn Phe Gly 
190 195 200 205 

Gly Asp Pro Asp Lys Val Met Ile Phe Gly Glu Ser Ala Gly Ala Met 
210 215 220 

Ser Val Ala His Gln Leu Val Ala Tyr Gly Gly Asp Asn Thr Tyr Asn 
225 230 235 

Gly Lys Gln Leu Phe His Ser Ala Ile Leu Gln Ser Gly Gly Pro Leu 
240 245 250 

Pro Tyr Phe Asp Ser Thr Ser Val Gly Pro Glu Ser Ala Tyr Ser Arg 
255 260 265 

Phe Ala Gln Tyr Ala Gly Cys Asp Thr Ser Ala Ser Asp Asn Asp Thr 
270 275 280 285 

Leu Ala Cys Leu Arg Ser Lys Ser Ser Asp Val Leu His Ser Ala Gln 
290 295 300 

Asn Ser Tyr Asp Leu Lys Asp Leu Phe Gly Leu Leu Pro Gln Phe Leu 
305 310 315 

Gly Phe Gly Pro Arg Pro Asp Gly Asn Ile Ile Pro Asp Ala Ala Tyr 
320 325 330 

Glu Leu Tyr Arg Ser Gly Arg Tyr Ala Lys Val Pro Tyr Ile Thr Gly 
335 340 345 

Asn Gln Glu Asp Glu Gly Thr Ile Leu Ala Pro Val Ala Ile Asn Ala 
350 355 360 365 

Thr Thr Thr Pro His Val Lys Lys Trp Leu Lys Tyr Ile Cys Ser Gln 
370 375 380 

Ala Ser Asp Ala Ser Leu Asp Arg Val Leu Ser Leu Tyr Pro Gly Ser 
385 390 395 

Trp Ser Glu Gly Ser Pro Phe Arg Thr Gly Ile Leu Asn Ala Leu Thr 
400 405 410 

Pro Gln Phe Lys Arg Ile Ala Ala Ile Phe Thr Asp Leu Leu Phe Gln 
415 420 425 

Ser Pro Arg Arg Val Met Leu Asn Ala Thr Lys Asp Val Asn Arg Trp 
430 435 440 445 

Thr Tyr Leu Ala Thr Gln Leu His Asn Leu Val Pro Phe Leu Gly Thr 
450 455 460 

Phe His Gly Ser Asp Leu Leu Phe Gln Tyr Tyr Val Asp Leu Gly Pro 
465 470 475 

Ser Ser Ala Tyr Arg Arg Tyr Phe Ile Ser Phe Ala Asn His His Asp 
480 485 490 

Pro Asn Val Gly Thr Asn Leu Gln Gln Trp Asp Met Tyr Thr Asp Ala 
495 500 505 

Gly Lys Glu Met Leu Gln Ile His Met Ile Gly Asn Ser Met Arg Thr 
510 515 520 525 

Asp Asp Phe Arg Ile Glu Gly Ile Ser Asn Phe Glu Ser Asp Val Thr 
530 535 540 

Leu Phe Gly 



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

GGGGCGGCCG CGCAGGCCCC AAGGCGGTCT CTCAAT 36

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part non-coding strand lipasell

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

ATTGAGAGAC CGCCGTGGGG CCTGGGCCAG 30 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part coding strand lipasell 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

CAAACTTTGA GACTGACGTT AATCTCTACG GTTAAAAC 38 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: primer lipo4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CCCCGCTAGC ACCGTAGAGA TTAACGTCAG TC 32 



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CCCGCGGCCG CGAGCATTGA TGGTGGTATC 30 



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 



(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part non-coding strand lipase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 



GATACCACGA TCAATGCT 18

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part coding strand lipase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

AACACAGGCC TCTGTACT 18 



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE:

(B) CLONE: primer lipo6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

CCGCGCTAGC AGTACAGAGG CCTGTGTT 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2685 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Saccharomyces cerevisiae 

(vii) IMMEDIATE SOURCE:

(B) CLONE: pYY105 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..2685

(D) OTHER INFORMATION: /product= "Flocculation protein"
/gene= "FLO1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATG ACA ATG CCT CAT CGC TAT ATG TTT TTG GCA GTC TTT ACA CTT CTG 48 
Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu 
1 5 10 15

GCA CTA ACT AGT GTG GCC TCA GGA GCC ACA GAG GCG TGC TTA CCA GCA 96 
Ala Leu Thr Ser Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala 
20 25 30 

GGC CAG AGG AAA AGT GGG ATG AAT ATA AAT TTT TAC CAG TAT TCA TTG 144
Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45 

AAA GAT TCC TCC ACA TAT TCG AAT GCA GCA TAT ATG GCT TAT GGA TAT 192 
Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr 
50 55 60

GCC TCA AAA ACC AAA CTA GGT TCT GTC GGA GGA CAA ACT GAT ATC TCG 240 
Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80 

ATT GAT TAT AAT ATT CCC TGT GTT AGT TCA TCA GGC ACA TTT CCT TGT 288 
Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95

CCT CAA GAA GAT TCC TAT GGA AAC TGG GGA TGC AAA GGA ATG GGT GCT 336 
Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110 



TGT TCT AAT AGT CAA GGA ATT GCA TAC TGG AGT ACT GAT TTA TTT GGT 384
Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125

TTC TAT ACT ACC CCA ACA AAC GTA ACC CTA GAA ATG ACA GGT TAT TTT 432
Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe 
130 135 140 

TTA CCA CCA CAG ACG GGT TCT TAC ACA TTC AAG TTT GCT ACA GTT GAC 480 
Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Lys Phe Ala Thr Val Asp
145 150 155 160 

GAC TCT GCA ATT CTA TCA GTA GGT GGT GCA ACC GCG TTC AAC TGT TGT 528 
Asp Ser Ala Ile Leu Ser Val Gly Gly Ala Thr Ala Phe Asn Cys Cys
165 170 175

GCT CAA CAG CAA CCG CCG ATC ACA TCA ACG AAC TTT ACC ATT GAC GGT 576 
Ala Gln Gln Gln Pro Pro Ile Thr Ser Thr Asn Phe Thr Ile Asp Gly
180 185 190 

ATC AAG CCA TGG GGT GGA AGT TTG CCA CCT AAT ATC GAA GGA ACC GTC 624 
Ile Lys Pro Trp Gly Gly Ser Leu Pro Pro Asn Ile Glu Gly Thr Val
195 200 205 

TAT ATG TAC GCT GGC TAC TAT TAT CCA ATG AAG GTT GTT TAC TCG AAC 672 
Tyr Met Tyr Ala Gly Tyr Tyr Tyr Pro Met Lys Val Val Tyr Ser Asn 
210 215 220 

GCT GTT TCT TGG GGT ACA CTT CCA ATT AGT GTG ACA CTT CCA GAT GGT 720 
Ala Val Ser Trp Gly Thr Leu Pro Ile Ser Val Thr Leu Pro Asp Gly
225 230 235 240 

ACC ACT GTA AGT GAT GAC TTC GAA GGG TAC GTC TAT TCC TTT GAC GAT 768 
Thr Thr Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Ser Phe Asp Asp 

245 250 255 

GAC CTA AGT CAA TCT AAC TGT ACT GTC CCT GAC CCT TCA AAT TAT GCT 816 
Asp Leu Ser Gln Ser Asn Cys Thr Val Pro Asp Pro Ser Asn Tyr Ala
260 265 270 

GTC AGT ACC ACT ACA ACT ACA ACG GAA CCA TGG ACC GGT ACT TTC ACT 864 
Val Ser Thr Thr Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr 
275 280 285 

TCT ACA TCT ACT GAA ATG ACC ACC GTC ACC GGT ACC AAC GGC GTT CCA 912 
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Val Pro 
290 295 300

ACT GAC GAA ACC GTC ATT GTC ATC AGA ACT CCA ACC AGT GAA GGT CTA 960
Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu
305 310 315 320 

ATC AGC ACC ACC ACT GAA CCA TGG ACT GGC ACT TTC ACT TCG ACT TCC 1008 
Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
325 330 335

ACT GAG GTT ACC ACC ATC ACT GGA ACC AAC GGT CAA CCA ACT GAC GAA 1056 
Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu
340 345 350 

ACT GTG ATT GTT ATC AGA ACT CCA ACC AGT GAA GGT CTA ATC AGC ACC 1104 
Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Ser Thr
355 360 365

ACC ACT GAA CCA TGG ACT GGT ACT TTC ACT TCT ACA TCT ACT GAA ATG 1152 
Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met 
370 375 380 

ACC ACC GTC ACC GGT ACT AAC GGT CAA CCA ACT GAC GAA ACC GTG ATT 1200 
Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
385 390 395 400 

GTT ATC AGA ACT CCA ACC AGT GAA GGT TTG GTT ACA ACC ACC ACT GAA 1248 
Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Val Thr Thr Thr Thr Glu
405 410 415

CCA TGG ACT GGT ACT TTT ACT TCG ACT TCC ACT GAA ATG TCT ACT GTC 1296
Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Ser Thr Val 
420 425 430 

ACT GGA ACC AAT GGC TTG CCA ACT GAT GAA ACT GTC ATT GTT GTC AAA 1344 
Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu Thr Val Ile Val Val Lys
435 440 445 

ACT CCA ACT ACT GCC ATC TCA TCC AGT TTG TCA TCA TCA TCT TCA GGA 1392 
Thr Pro Thr Thr Ala Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly
450 455 460 

CAA ATC ACC AGC TCT ATC ACG TCT TCG CGT CCA ATT ATT ACC CCA TTC 1440 
Gln Ile Thr Ser Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe
465 470 475 480

TAT CCT AGC AAT GGA ACT TCT GTG ATT TCT TCC TCA GTA ATT TCT TCC 1488
Tyr Pro Ser Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser
485 490 495

TCA GTC ACT TCT TCT CTA TTC ACT TCT TCT CCA GTC ATT TCT TCC TCA 1536 
Ser Val Thr Ser Ser Leu Phe Thr Ser Ser Pro Val Ile Ser Ser Ser
500 505 510 

GTC ATT TCT TCT TCT ACA ACA ACC TCC ACT TCT ATA TTT TCT GAA TCA 1584 
Val Ile Ser Ser Ser Thr Thr Thr Ser Thr Ser Ile Phe Ser Glu Ser
515 520 525 

TCT AAA TCA TCC GTC ATT CCA ACC AGT AGT TCC ACC TCT GGT TCT TCT 1632 
Ser Lys Ser Ser Val Ile Pro Thr Ser Ser Ser Thr Ser Gly Ser Ser
530 535 540 

GAG AGC GAA ACG AGT TCA GCT GGT TCT GTC TCT TCT TCC TCT TTT ATC 1680 
Glu Ser Glu Thr Ser Ser Ala Gly Ser Val Ser Ser Ser Ser Phe Ile
545 550 555 560

TCT TCT GAA TCA TCA AAA TCT CCT ACA TAT TCT TCT TCA TCA TTA CCA 1728 
Ser Ser Glu Ser Ser Lys Ser Pro Thr Tyr Ser Ser Ser Ser Leu Pro 

565 570 575 

CTT GTT ACC AGT GCG ACA ACA AGC CAG GAA ACT GCT TCT TCA TTA CCA 1776 
Leu Val Thr Ser Ala Thr Thr Ser Gln Glu Thr Ala Ser Ser Leu Pro
580 585 590 

CCT GCT ACC ACT ACA AAA ACG AGC GAA CAA ACC ACT TTG GTT ACC GTG 1824 
Pro Ala Thr Thr Thr Lys Thr Ser Glu Gln Thr Thr Leu Val Thr Val
595 600 605 

ACA TCC TGC GAG TCT CAT GTG TGC ACT GAA TCC ATC TCC CCT GCG ATT 1872 
Thr Ser Cys Glu Ser His Val Cys Thr Glu Ser Ile Ser Pro Ala Ile
610 615 620 

GTT TCC ACA GCT ACT GTT ACT GTT AGC GGC GTC ACA ACA GAG TAT ACC 1920 
Val Ser Thr Ala Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr 
625 630 635 640 

ACA TGG TGC CCT ATT TCT ACT ACA GAG ACA ACA AAG CAA ACC AAA GGG 1968 
Thr Trp Cys Pro Ile Ser Thr Thr Glu Thr Thr Lys Gln Thr Lys Gly
645 650 655

ACA ACA GAG CAA ACC ACA GAA ACA ACA AAA CAA ACC ACG GTA GTT ACA 2016
Thr Thr Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr
660 665 670 

ATT TCT TCT TGT GAA TCT GAC GTA TGC TCT AAG ACT GCT TCT CCA GCC 2064 
Ile Ser Ser Cys Glu Ser Asp Val Cys Ser Lys Thr Ala Ser Pro Ala
675 680 685 

ATT GTA TCT ACA AGC ACT GCT ACT ATT AAC GGC GTT ACT ACA GAA TAC 2112 
Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr Glu Tyr
690 695 700 

ACA ACA TGG TGT CCT ATT TCC ACC ACA GAA TCG AGG CAA CAA ACA ACG 2160 
Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Arg Gln Gln Thr Thr
705 710 715 720 

CTA GTT ACT GTT ACT TCC TGC GAA TCT GGT GTG TGT TCC GAA ACT GCT 2208 
Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys Ser Glu Thr Ala 

725 730 735 

TCA CCT GCC ATT GTT TCG ACG GCC ACG GCT ACT GTG AAT GAT GTT GTT 2256 
Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr Val Asn Asp Val Val
740 745 750 

ACG GTC TAT CCT ACA TGG AGG CCA CAG ACT GCG AAT GAA GAG TCT GTC 2304 
Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr Ala Asn Glu Glu Ser Val
755 760 765 

AGC TCT AAA ATG AAC AGT GCT ACC GGT GAG ACA ACA ACC AAT ACT TTA 2352
Ser Ser Lys Met Asn Ser Ala Thr Gly Glu Thr Thr Thr Asn Thr Leu 
770 775 780 

GCT GCT GAA ACG ACT ACC AAT ACT GTA GCT GCT GAG ACG ATT ACC AAT 2400 
Ala Ala Glu Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Ile Thr Asn
785 790 795 800 

ACT GGA GCT GCT GAG ACG AAA ACA GTA GTC ACC TCT TCG CTT TCA AGA 2448 
Thr Gly Ala Ala Glu Thr Lys Thr Val Val Thr Ser Ser Leu Ser Arg 

805 810 815 

TCT AAT CAC GCT GAA ACA CAG ACG GCT TCC GCG ACC GAT GTG ATT GGT 2496 
Ser Asn His Ala Glu Thr Gln Thr Ala Ser Ala Thr Asp Val Ile Gly
820 825 830

CAC AGC AGT AGT GTT GTT TCT GTA TCC GAA ACT GGC AAC ACC AAG AGT 2544
His Ser Ser Ser Val Val Ser Val Ser Glu Thr Gly Asn Thr Lys Ser 
835 840 845 

CTA ACA AGT TCC GGG TTG AGT ACT ATG TCG CAA CAG CCT CGT AGC ACA 2592 
Leu Thr Ser Ser Gly Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr
850 855 860 

CCA GCA AGC AGC ATG GTA GGA TAT AGT ACA GCT TCT TTA GAA ATT TCA 2640 
Pro Ala Ser Ser Met Val Gly Tyr Ser Thr Ala Ser Leu Glu Ile Ser
865 870 875 880 

ACG TAT GCT GGC AGT GCA ACA GCT TAC TGG CCG GTA GTG GTT TAA 2685
Thr Tyr Ala Gly Ser Ala Thr Ala Tyr Trp Pro Val Val Val

885 890 895 
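
The pairing of codons and three-letter residues in SEQ ID NO: 21 can be spot-checked mechanically. The following is a minimal Python sketch and not part of the original disclosure: it assumes the standard genetic code, and the names `CODON_TABLE` and `translate` are illustrative only. It translates the first 48 nucleotides of the listing and confirms that a 2685-nucleotide CDS corresponds to 894 residues plus a stop codon.

```python
from itertools import product

# Standard genetic code (an assumption of this sketch), with codons
# enumerated in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA string into one-letter amino acids."""
    dna = "".join(dna.split()).upper()  # drop the spaces between codons
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 48 nucleotides of SEQ ID NO: 21, copied from the listing.
first_line = "ATG ACA ATG CCT CAT CGC TAT ATG TTT TTG GCA GTC TTT ACA CTT CTG"
protein_start = translate(first_line)
print(protein_start)

# 2685 nucleotides give 895 codons: 894 residues plus the stop codon.
codons, remainder = divmod(2685, 3)
print(codons - 1, remainder)
```

The one-letter string printed for the first line corresponds to Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu in the listing.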



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 894 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu 
1 5 10 15

Ala Leu Thr Ser Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala 
20 25 30 

Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45

Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr 
50 55 60 

Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80 



Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95

Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110 

Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125 



Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe 
130 135 140 

Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Lys Phe Ala Thr Val Asp
145 150 155 160 



Asp Ser Ala Ile Leu Ser Val Gly Gly Ala Thr Ala Phe Asn Cys Cys
165 170 175

Ala Gln Gln Gln Pro Pro Ile Thr Ser Thr Asn Phe Thr Ile Asp Gly
180 185 190 



Ile Lys Pro Trp Gly Gly Ser Leu Pro Pro Asn Ile Glu Gly Thr Val
195 200 205 

Tyr Met Tyr Ala Gly Tyr Tyr Tyr Pro Met Lys Val Val Tyr Ser Asn 
210 215 220 

Ala Val Ser Trp Gly Thr Leu Pro Ile Ser Val Thr Leu Pro Asp Gly
225 230 235 240 



Thr Thr Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Ser Phe Asp Asp 

245 250 255

Asp Leu Ser Gln Ser Asn Cys Thr Val Pro Asp Pro Ser Asn Tyr Ala
260 265 270 

Val Ser Thr Thr Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr 
275 280 285 



Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Val Pro 
290 295 300 

Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu
305 310 315 320



Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
325 330 335

Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu
340 345 350 

Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Ser Thr
355 360 365 

Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met 
370 375 380 

Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
385 390 395 400 

Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Val Thr Thr Thr Thr Glu
405 410 415

Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Ser Thr Val 
420 425 430 

Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu Thr Val Ile Val Val Lys
435 440 445 

Thr Pro Thr Thr Ala Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly
450 455 460 

Gln Ile Thr Ser Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe
465 470 475 480 

Tyr Pro Ser Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser
485 490 495

Ser Val Thr Ser Ser Leu Phe Thr Ser Ser Pro Val Ile Ser Ser Ser
500 505 510 

Val Ile Ser Ser Ser Thr Thr Thr Ser Thr Ser Ile Phe Ser Glu Ser
515 520 525

Ser Lys Ser Ser Val Ile Pro Thr Ser Ser Ser Thr Ser Gly Ser Ser
530 535 540 



Glu Ser Glu Thr Ser Ser Ala Gly Ser Val Ser Ser Ser Ser Phe Ile
545 550 555 560 



Ser Ser Glu Ser Ser Lys Ser Pro Thr Tyr Ser Ser Ser Ser Leu Pro 

565 570 575

Leu Val Thr Ser Ala Thr Thr Ser Gln Glu Thr Ala Ser Ser Leu Pro
580 585 590 

Pro Ala Thr Thr Thr Lys Thr Ser Glu Gln Thr Thr Leu Val Thr Val
595 600 605 

Thr Ser Cys Glu Ser His Val Cys Thr Glu Ser Ile Ser Pro Ala Ile
610 615 620 

Val Ser Thr Ala Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr 
625 630 635 640

Thr Trp Cys Pro Ile Ser Thr Thr Glu Thr Thr Lys Gln Thr Lys Gly
645 650 655

Thr Thr Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr
660 665 670 

Ile Ser Ser Cys Glu Ser Asp Val Cys Ser Lys Thr Ala Ser Pro Ala
675 680 685 

Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr Glu Tyr
690 695 700 

Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Arg Gln Gln Thr Thr
705 710 715 720

Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys Ser Glu Thr Ala 

725 730 735

Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr Val Asn Asp Val Val
740 745 750 

Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr Ala Asn Glu Glu Ser Val
755 760 765 

Ser Ser Lys Met Asn Ser Ala Thr Gly Glu Thr Thr Thr Asn Thr Leu 
770 775 780 



Ala Ala Glu Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Ile Thr Asn
785 790 795 800 



Thr Gly Ala Ala Glu Thr Lys Thr Val Val Thr Ser Ser Leu Ser Arg 

805 810 815

Ser Asn His Ala Glu Thr Gln Thr Ala Ser Ala Thr Asp Val Ile Gly
820 825 830 

His Ser Ser Ser Val Val Ser Val Ser Glu Thr Gly Asn Thr Lys Ser 
835 840 845 

Leu Thr Ser Ser Gly Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr
850 855 860 

Pro Ala Ser Ser Met Val Gly Tyr Ser Thr Ala Ser Leu Glu Ile Ser
865 870 875 880 

Thr Tyr Ala Gly Ser Ala Thr Ala Tyr Trp Pro Val Val Val 

885 890 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer pcrflo1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
GAATTCGCTA GCAATTATGC TGTCAGTACC 



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: Part non-coding sequence FLO1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

AGTGGTACTG ACAGCATAAT TTGA 



(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part coding sequence FLO1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

AATAAAATTC GCGTTCTTTT TACG 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer pcrflo2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GAGCTCAAGC TTCGTAAAAA GAACGCGAAT T 
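
Several primers in the listing are paired with a fragment of the strand they are designed against. The sketch below is illustrative only and not part of the original disclosure. It checks, for four of the pairs, that the reverse complement of the primer's 3' portion occurs in the listed fragment. The 5' tail lengths were inferred by inspection of the listing, and the restriction-site table holds well-known recognition sequences that the listing itself does not name; both are assumptions of this sketch, as are the helper names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (spaces ignored)."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


# name -> (primer, listed strand fragment, assumed length of the 5' tail)
PAIRS = {
    "lipo4 vs SEQ ID NO: 15": (
        "CCCCGCTAGCACCGTAGAGATTAACGTCAGTC",
        "CAAACTTTGAGACTGACGTTAATCTCTACGGTTAAAAC", 10),
    "lipo6 vs SEQ ID NO: 19": (
        "CCGCGCTAGCAGTACAGAGGCCTGTGTT",
        "AACACAGGCCTCTGTACT", 10),
    "pcrflo1 vs SEQ ID NO: 24": (
        "GAATTCGCTAGCAATTATGCTGTCAGTACC",
        "AGTGGTACTGACAGCATAATTTGA", 12),
    "pcrflo2 vs SEQ ID NO: 25": (
        "GAGCTCAAGCTTCGTAAAAAGAACGCGAATT",
        "AATAAAATTCGCGTTCTTTTTACG", 12),
}

# Well-known recognition sequences (not stated in the listing itself).
SITES = {"NheI": "GCTAGC", "NotI": "GCGGCCGC",
         "EcoRI": "GAATTC", "HindIII": "AAGCTT"}

results = {}
for name, (primer, fragment, tail) in PAIRS.items():
    # The 3' part of the primer should be complementary to the fragment.
    matches = reverse_complement(primer[tail:]) in fragment
    sites = sorted(s for s, motif in SITES.items() if motif in primer[:tail])
    results[name] = (matches, sites)
    print(name, matches, sites)
```

A pair that reports `False` is not necessarily an error in the listing: a primer may carry a deliberate mismatch to introduce a mutation, so such cases are better compared by eye.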
CLAIMS

1. A method for immobilizing an enzyme, comprising the use of recombinant DNA 
techniques for producing an enzyme or a functional part thereof linked to the cell 
wall of a host cell, preferably a microbial cell, and whereby the enzyme or 
functional fragment thereof is localized at the exterior of the cell wall. 

2. The method of claim 1, wherein the enzyme or the functional part thereof is 
immobilized by linking to the C-terminal part of a protein that ensures anchoring 
in the cell wall. 

3. A recombinant polynucleotide comprising a structural gene encoding a protein 
providing catalytic activity and at least a part of a gene encoding a protein 
capable of anchoring in a eukaryotic or prokaryotic cell wall, said part encoding 
at least the C-terminal part of said anchoring protein. 

4. The polynucleotide of claim 3, further comprising a sequence encoding a signal 
peptide ensuring secretion of the expression product of the polynucleotide. 

5. The polynucleotide of claim 4, wherein the signal peptide is derived from a
protein selected from the group consisting of glycosyl-phosphatidyl-inositol (GPI)
anchoring protein, α-factor, α-agglutinin, invertase or inulinase, α-amylase of
Bacillus, and proteinases of lactic acid bacteria.

6. The polynucleotide of any of claims 3-5, wherein the protein capable of anchoring
in the cell wall is selected from the group consisting of α-agglutinin, AGA1,
FLO1, Major Cell Wall Protein of lower eukaryotes, and proteinases of lactic
acid bacteria.

7. The polynucleotide of any of claims 3-6, operably linked to a promoter, 
preferably an inducible promoter.

8. The polynucleotide of any of claims 3-7, wherein the protein providing catalytic
activity is a hydrolytic enzyme, e.g. a lipase. 

9. The polynucleotide of any of claims 3-7, wherein the protein providing catalytic 
activity is an oxidoreductase, e.g. an oxidase. 



10. A recombinant vector comprising a polynucleotide as claimed in any of claims 
3-9. 



11. The recombinant vector of claim 10, wherein the protein providing catalytic 
activity exhibits said catalytic activity when present in a multimeric form, said 
vector further comprising a second polynucleotide comprising a structural gene 
encoding the same protein providing catalytic activity combined with a sequence 
encoding a signal peptide ensuring secretion of the expression product of said 
second polynucleotide, said second polynucleotide being operably linked to a 
regulatable promoter, preferably an inducible or repressible promoter. 

12. A chimeric protein encoded by a polynucleotide as claimed in any of claims 3-9. 

13. A host cell, preferably a microorganism, containing a polynucleotide as claimed in
any of claims 3-9 or a vector as claimed in claim 10 or 11.

14. A host cell, preferably a microorganism, containing a polynucleotide as claimed in 
any of claims 3-9 or a vector as claimed in claim 10, wherein the protein 
providing catalytic activity exhibits said catalytic activity when present in a 
multimeric form, said microorganism further comprising a second polynucleotide 
comprising a structural gene encoding the same protein providing catalytic activity 
combined with a sequence encoding a signal peptide ensuring secretion of the 
expression product of said second polynucleotide, said second polynucleotide 
being operably linked to a regulatable promoter, preferably an inducible or 
repressible promoter and said second polynucleotide being present either in 
another vector or in the chromosome of said microorganism.

15. The host cell or microorganism of claim 13 or 14, having at least one of said
polynucleotides integrated in its chromosome. 

16. A host cell, preferably a microorganism, having a protein as claimed in claim 12 
immobilized on its cell wall. 

17. The host cell or microorganism of any of claims 13-16, which is a lower
eukaryote, in particular a yeast.

18. A process for carrying out an enzymatic process by using an immobilized 
catalytically active protein, wherein a substrate for said catalytically active protein 
is contacted with a host cell or microorganism as claimed in any of claims 13-17.

DNA SEQUENCE OF ALPHA-AGGLUTININ:

1 AAGCTTTAGG TAAGGGAGGC AGGGGGAAAA GATACTGAAA 
41 TGACGGAAAA CGAGAATATG GAGCAGGGAG CAACTTTTAG 
81 AGCTTTACCC GTTAAAAGGT CAAATCGAGG CTTCCTGCCT 
121 TTGTCTGATT TTAGTAGTAC CGGAAGGTTT ATTACGCCCA 
161 AGAACAGTGC TTGAATTGAG TTCTCGGGAC ACGGGAAAGA 
201 CAATGGAAGA AAAATTTACA TTCAGTAGCC TTATATATGA 
241 AATGCTGCCA AGCCACGTCT TTATAAGTAG ATAATGTCCC 
281 ATGAGCTGAA CTATGGGAAT TTATGACGCA GTTCATTGTA 
321 TATATATTAC ATTAACTCTT TAGTTTAACA TCTGAATTGT 
361 TTTATAAAAT AACTTTTTGA ATTTTTTTAT GATCGCTTAG 
401 TTAAGTCTAT TATATCAGGT TTTTTCATTC ATCATAATTG 
441 TTCGTTAAAT ATGAGTATAT TTAAATACAG GAATTAGTAT 
481 CATTTGCAGT CACGAAAAGG GCCGTTTCAT AGAGAGTTTT 
521 CTTAATAAAG TTGAGGGTTT CCGTGATAGT TTTGAGGGGT 
561 TGTTTGAACT AGATTTACGC TTACCTTTCA ACTGATTAAT 
601 TTTTTCAGCG GGCTTATCAT AATCATCCAT CATAGCAGTC 
641 TTTCTGGACT TCGTCGAGGA CTGGCTTTCT GAATTTTGAC 
681 GGTCCCTATT AGCTCCAGTT GGAGGAATTG AGTTACCTAC 
721 AACTGGCAAG AGGTCTTTGT TTGGATTCAA AATAGGACTT 
761 TGTGGTAGCA GTTTGGTTTT ATTCAATCTA AAGATATGAG 
801 AAACAGGTTT TAAGTAAATC GATACTATTG TACCAATGTT 
841 TAGCTCCAAT TCCTCCAAAA CGGTGGGATC TAATTTTGTG 
881 TTCATTTCTA TTAGTGGCAA CTCTCCGTCC AGTACTGATT 
921 TTAAAGATTC AAAAGTTATC GCGTTTGATA TACGAGACGT 
961 TTTCGTTAAT GACAGCAATC TCCAATACAT CAGTGTTTTA 
1001 TCTCTTAAGT CAGGATTATT TTCGTGATCG GTGCATCCTT 
1041 TTAATAAATC CATACAAAGT TCTTCAGTTT CCTTTGTAGG 
1081 ATTTCTGATG AAGAATTTTA TTGCTGAGTT CAGAATGGAA 
1121 AATTGCACTT CTAGCGTCTC ATTAAACATG TTTGAGGAAA 
1161 AAACTCTAAA TAACTCCAGG TAGTTTGGAA TTACATCCGA 
1201 ATATTGCGTT ATTATCCAGA TCATAGCGTT TTTTGATTCA 
1241 GGTTCCTGTA CAACTTCAGT GTGTTTGACT AGTTCTGTTA 
1281 CGTTTGCTTT AAAATTATTG GGATATTTCC TCAAAATATT 
1321 TCTGAAAACC GAAATAATCT CCTGGACGAC ATAATCAACA
1361 CCGAATTCTA ACAAATCTAG TAGCACAGCG ACACAATCGT 
1401 GTACAGAGTC TTCATCTAGC TTAACAGCGA GATTACCAAT
1441 GGCTCTGACT GATTTCCTTG ACATTTGAAT ATCAATATCT 
1481 GTAGCATATT GTTCCAACTC TTCTAGAATT CTTGGTAATG 
1521 TTTCCTTGTT AGCTAAAAGA TATAAACACT CTAATTTCGT 
1561 GTCTTTGATG TATATGGGGT CATTGTACTC GATGAAAAAA 
1601 TACGAAATGT CTAGCCTGAG TAGAGATGAC TCCCTACTCA
1641 ATAAAAGAAG AATAACGTTT CTTAATACTA AAAATTGTAA 
1681 TTCAGGCGGC TTATCTAACA AAGCTATTAC AGAGTTAGAT 
1721 AGCTTTTCGG CTAGAGTTTC TTTGATGACG TCAACATAAT 
1761 TCAACAAGTA CATGATGAAT TTTAAAGAGT TCAACACTAC 
1801 GTATGTGTTT ACTTGTTGCA GGTACGGTAA AGCTAGTTCG 
1841 ATCATTTCAT GGGTATCCAA ATAATGCTGC GGCACAACCG 
1881 AAGTCGTCAA AACTTCCAAA ACAGTAGCCT TATTCCACTC 
1921 ATTTAATTCG GGTAAAAGTT CTAGCATGTC AAAAGCGAGT 
1961 TCCAAGGGAA TCCTGAAGGT TCCATGTTAG CGTTTTTTTC 
2001 GTGAATGGAA TATAAAGTAT GTAATGCAGC TACAATGACT 
2041 TCTGGAGAGC TCGACTGTGC CTTTACAATG TCATGTAGAA 
2081 TGCTTGATAA CCCCAATACC CTTTCATGAT CAATTTCATC 
2121 TAAATCCAAC AGTGCGTAAA TTGCTGTCCT CGTCACTTGT 
2161 TCAGGTGGAG ACTTGTGATT TACCAATGAA ATGATACAGT 
2201 CGAAGGCCTG ATCAGATAGC TCTTTCACCG GGACTAATAC 
2241 CAGAGTTCTT AGTGCCATTA TTTGTAACTT TTCATCTCTG 
2281 CTTTTGAAAT CGTCCATTAT AAATGGCAAA GCCTCTCTGG 
2321 CCTGCTGAGG TTTTAATGCG CCGATCACCC TAATATACTC 
2361 ATGGCAAATT CTTTTCACTT CTAGATCATC TTCAATTTGC 
2401 CAAAATTTCA AGAGCTCAGA AAACAGAAGG GACATTTCGC 
2441 CATAGTTTCC TAGAACCAAA TTGGCGATAA TTTTTCTCAG 
2481 AGCATTTTTC CTTCTTGTTA TATTCGATTT AAACTTTTTT 
2521 ACTCCAAAAT GTTGCAGATC TGTGACGATT TCATTTGCTT 
2561 TATATCTGGC AAAAACTTTT TGATCGGACA TAAGCGAAAT 
2601 ACGTCCTATT AATGAAGTGA ATGTTCTTGC TGTATTCCCT 
2641 TCTTGTGCAG TAGATTAATT CTGTTTCCAG GCTGCGATAC 
2681 TTTGATACCC AATACTAAAA GTTGATGATT TGAACGATCT 
2721 CCTATTTCCT CGCACATTTT TGGAGCGATA CCCGGAAGAC 
2761 AGAATCGCGA TGTTAAGAAA ATAGTTCTGA TGGCACTAAA 
2801 GAGATCATGA TTAAGGAAAG GTAAGTGATA TGCATGAATG 
2841 GGAATAGGCT TTCGAACTTG ACGATTTAGT TCCTTATTTC 
2881 TATCCATCTA ATCCTCCAAC TTCAATAGGC CTTATCTAGC 
2921 TCAGAGCAGT ATTTAATTGA GAATAGTAGC TTAATTGAAA 
2961 CCTTACTAAA AAAGTGTATG GTTACATAAG ATAAGGCGTT 
3001 AAGAAGAGTA TACATATGCA TTATTCATTA CCAAGACCAC
3041 TATGAATAGT AATACCATAT TTAGCTTTTG AAACTCATGT
3081 TTTCTATTGT GTTGTTTCAA ATTCCTCTGT TAGGCTCAAT
3121 TTAGGTTAAT TAAATTATAA AAAAATATAA AAAATAAAGA 
3161 AAGTTTATCC ATCGGCACCT CAATTCAATG GAGTAAACAG 
3201 TTTCAACACT GAGTGGTGAA ACATTGAACA ACTACATGCA 
3241 GTTTCCCGCC ACGAGGCAAG TGTAGGTCCT TTGTCCATTT 
3281 CGCTTTGTTT TGCAGGTCAT TGATGACCTA ATTAGGAAGG
3321 TAGAAGCCGC TCCAGCTCAA TAAGGAAATG CTAAGGGTAC 
3361 TCGCCTTTGG TGTTTTACCA TACAATGGCA GCTTTATGTC 
3401 ACTTCATTCT TCAGTAACGG CGCTTAAATA TTCCCAAAAA 
3441 CGTTACAATG GAATTGTTTG ATCATGTAAC GAAATGCAAT 
3481 CTTCTAAAAA AAAAGCCATG TGAATCAAAA AAAGATTCCT 
3521 TTTAGCATAC TATAAATATG CAAAATGCCC TCTATTTATT 
3561 CTAGTAATCG TCCATTCTCA TATCTTCCTT ATATCAGTCG 
3601 CCTCGCTTAA TATAGTCAGC ACAAAAGGAA CAACAATTCG 
3641 CCAGTTTTCA AAATGTTCAC TTTTCTCAAA ATTATTCTGT 
3681 GGCTTTTTTC CTTGGCATTG GCCTCTGCTA TAAATATCAA 
3721 CGATATCACA TTTTCCAATT TAGAAATTAC TCCACTGACT 
3761 GCAAATAAAC AACCTGATCA AGGTTGGACT GCCACTTTTG 
3801 ATTTTAGTAT TGCAGATGCG TCTTCCATTA GGGAGGGCGA 
3841 TGAATTCACA TTATCAATGC CACATGTTTA TAGGATTAAG 
3881 CTATTAAACT CATCGCAAAC AGCTACTATT TCCTTAGCGG 
3921 ATGGTACTGA GGCTTTCAAA TGCTATGTTT CGCAACAGGC 
3961 TGCATACTTG TATGAAAATA CTACTTTCAC ATGTACTGCT 
4001 CAAAATGACC TGTCCTCCTA TAATACGATT GATGGATCCA 
4041 TAACATTTTC GCTAAATTTT AGTGATGGTG GTTCCAGCTA 
4081 TGAATATGAG TTAGAAAACG CTAAGTTTTT CAAATCTGGG 
4121 CCAATGCTTG TTAAACTTGG TAATCAAATG TCAGATGTGG 
4161 TGAATTTCGA TCCTGCTGCT TTTACAGAGA ATGTTTTTCA 
4201 CTCTGGGCGT TCAACTGGTT ACGGTTCTTT TGAAAGTTAT 
4241 CATTTGGGTA TGTATTGTCC AAACGGATAT TTCCTGGGTG 
4281 GTACTGAGAA GATTGATTAC GACAGTTCCA ATAACAATGT 
4321 CGATTTGGAT TGTTCTTCAG TTCAGGTTTA TTCATCCAAT 
4361 GATTTTAATG ATTGGTGGTT CCCGCAAAGT TACAATGATA
4401 CCAATGCTGA CGTCACTTGT TTTGGTAGTA ATCTGTGGAT 
4441 TACACTTGAC GAAAAACTAT ATGATGGGGA AATGTTATGG 
4481 GTTAATGCAT TACAATCTCT ACCCGCTAAT GTAAACACAA 
4521 TAGATCATGC GTTAGAATTT CAATACACAT GCCTTGATAC 
4561 CATAGCAAAT ACTACGTACG CTACGCAATT CTCGACTACT 
4601 AGGGAATTTA TTGTTTATCA GGGTCGGAAC CTCGGTACAG 
4641 CTAGCGCCAA AAGCTCTTTT ATCTCAACCA CTACTACTGA 
4681 TTTAACAAGT ATAAACACTA GTGCGTATTC CACTGGATCC 
4721 ATTTCCACAG TAGAAACAGG CAATCGAACT ACATCAGAAG
4761 TGATCAGTCA TGTGGTGACT ACCAGCACAA AACTGTCTCC 
4801 AACTGCTACT ACCAGCCTGA CAATTGCACA AACCAGTATC 
4841 TATTCTACTG ACTCAAATAT CACAGTAGGA ACAGATATTC 
4881 ACACCACATC AGAAGTGATT AGTGATGTGG AAACCATTAG 
4921 CAGAGAAACA GCTTCGACCG TTGTAGCCGC TCCAACCTCA 
4961 ACAACTGGAT GGACAGGCGC TATGAATACT TACATCCCGC
5001 AATTTACATC CTCTTCTTTC GCAACAATCA ACAGCACACC 
5041 AATAATCTCT TCATCAGCAG TATTTGAAAC CTCAGATGCT 
5081 TCAATTGTCA ATGTGCACAC TGAAAATATC ACGAATACTG 
5121 CTGCTGTTCC ATCTGAAGAG CCCACTTTTG TAAATGCCAC 
5161 GAGAAACTCC TTAAATTCCT TCTGCAGCAG CAAACAGCCA 
5201 TCCAGTCCCT CATCTTATAC GTCTTCCCCA CTCGTATCGT 
5241 CCCTCTCCGT AAGCAAAACA TTACTAAGCA CCAGTTTTAC 
5281 GCCTTCTGTG CCAACATCTA ATACATATAT CAAAACGGAA 
5321 AATACGGGTT ACTTTGAGCA CACGGCTTTG ACAACATCTT 
5361 CAGTTGGCCT TAATTCTTTT AGTGAAACAG CACTCTCATC 
5401 TCAGGGAACG AAAATTGACA CCTTTTTAGT GTCATCCTTG 
5441 ATCGCATATC CTTCTTCTGC ATCAGGAAGC CAATTGTCCG 
5481 GTATCCAACA GAATTTCACA TCAACTTCTC TCATGATTTG 
5521 AACCTATGAA GGTAAAGCGT CTATATTTTT CTCAGCTGAG 
5561 CTCGGTTCGA TCATTTTTCT GCTTTTGTCG TACCTGCTAT 
5601 TCTAAAACGG GTACTGTACA GTTAGTACAT TGAGTCGAAA 
5641 TATACGAAAT TATTGTTCAT AATTTTCATC CTGGCTCTTT 
5681 TTTTCTTCAA CCATAGTTAA ATGGACAGTT CATATCTTAA 
5721 ACTCTAATAA TACTTTTCTA GTTCTTATCC TTTTCCGTCT 
5761 CACCGCAGAT TTTATCATAG TATTAAATTT ATATTTTGTT 
5801 CGTAAAAAGA AAAATTTGTG AGCGTTACCG CTCGTTTCAT 
5841 TACCCGAAGG CTGTTTCAGT AGACCACTGA TTAAGTAAGT 
5881 AGATGAAAAA ATTTCATCAC CATGAAAGAG TTCGATGAGA 
5921 GCTACTTTTT CAAATGCTTA ACAGCTAACC GCCATTCAAT 
5961 AATGTTACGT TCTCTTCATT CTGCGGCTAC GTTATCTAAC 
6001 AAGAGGTTTT ACTCTCTCAT ATCTCATTCA AATAGAAAGA 
6041 ACATAATCAA AAAGCTT 6057 
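
Because each figure line carries its own start coordinate, the sequence pages can be checked for transcription damage. The sketch below is illustrative only and not part of the original disclosure; the function name `join_numbered_lines` is hypothetical. It joins numbered lines into one string, verifying that every line starts where the previous one ended, that only A, C, G and T occur, and that a trailing end coordinate (such as 6057 above) agrees with the running count.

```python
def join_numbered_lines(lines):
    """Join 'start block block ... [end]' lines into one DNA string.

    Raises ValueError when a start coordinate, an end coordinate or a
    character in a block is inconsistent with the rest of the input.
    """
    position = None  # coordinate expected at the start of the next line
    sequence = []
    for line in lines:
        fields = line.split()
        start, blocks = int(fields[0]), fields[1:]
        end = None
        # A trailing all-digit field is read as the end coordinate.
        if blocks and blocks[-1].isdigit():
            end = int(blocks.pop())
        if position is not None and start != position:
            raise ValueError(f"line starts at {start}, expected {position}")
        bases = "".join(blocks).upper()
        if set(bases) - set("ACGT"):
            raise ValueError(f"unexpected characters on line {start}")
        sequence.append(bases)
        position = start + len(bases)
        if end is not None and end != position - 1:
            raise ValueError(f"end {end} does not match {position - 1}")
    return "".join(sequence)


# Last three lines of the alpha-agglutinin sequence, copied from above.
tail = join_numbered_lines([
    "5961 AATGTTACGT TCTCTTCATT CTGCGGCTAC GTTATCTAAC",
    "6001 AAGAGGTTTT ACTCTCTCAT ATCTCATTCA AATAGAAAGA",
    "6041 ACATAATCAA AAAGCTT 6057",
])
print(len(tail))
```

A coordinate that was split during transcription (for example `14 01` in place of `1401`) or a stray character inside a block makes the function raise `ValueError`, which points to the affected line.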
FIGURE 8, 1/2
DNA SEQUENCE OF LIPASE B: 

1 AATTCGGCAC GAGATTCCTT TGATTTGCAA CTGTTAATCA 
41 TGGTTTCCAA AAGCTTTTTT TTGGCTGCGG CGCTCAACGT 
81 AGTGGGCACC TTGGCCCAGG CCCCCACGGC CGTTCTTAAT 
121 GGCAACGAGG TCATCTCTGG TGTCCTTGAG GGCAAGGTTG 
161 ATACCTTCAA GGGAATCCCA TTTGCTGACC CTCCTGTTGG 
201 TGACTTGCGG TTCAAGCACC CCCAGCCTTT CACTGGATCC 
241 TACCAGGGTC TTAAGGCCAA CGACTTCAGC TCTGCTTGTA 
281 TGCAGCTTGA TCCTGGCAAT GCCTTTTCTT TGCTTGACAA 
321 AGTAGTGGGC TTGGGAAAGA TTCTTCCTGA TAACCTTAGA 
361 GGCCCTCTTT ATGACATGGC CCAGGGTAGT GTCTCCATGA 
401 ATGAGGACTG TCTCTACCTT AACGTTTTCC GCCCCGCTGG 
441 CACCAAGCCT GATGCTAAGC TCCCCGTCAT GGTTTGGATT 
481 TACGGTGGTG CCTTTGTGTT TGGTTCTTCT GCTTCTTACC 
521 CTGGTAACGG CTACGTCAAG GAGAGTGTGG AAATGGGCCA 
561 GCCTGTTGTG TTTGTTTCCA TCAACTACCG TACCGGCCCC 
601 TATGGATTCT TGGGTGGTGA TGCCATCACC GCTGAGGGCA 
641 ACACCAACGC TGGTCTGCAC GACCAGCGCA AGGGTCTCGA 
681 GTGGGTTAGC GACAACATTG CCAACTTTGG TGGTGATCCC 
721 GACAAGGTCA TGATTTTCGG TGAGTCCGCT GGTGCCATGA 
761 GTGTTGCTCA CCAGCTTGTT GCCTACGGTG GTGACAACAC 
801 CTACAACGGA AAGCAGCTTT TCCACTCTGC CATTCTTCAG 
841 TCTGGCGGTC CTCTTCCTTA CTTTGACTCT ACTTCTGTTG 
881 GTCCCGAGAG TGCCTACAGC AGATTTGCTC AGTATGCCGG 
921 ATGTGACACC AGTGCCAGTG ATAATGACAC TCTGGCTTGT 
961 CTCCGCAGCA AGTCCAGCGA TGTCTTGCAC AGTGCGCAGA 
1001 ACTCGTATGA TCTTAAGGAC CTGTTTGGTC TGCTCCCTCA 
1041 ATTCCTTGGA TTTGGTCCCA GACCCGACGG CAACATTATT 
1081 CCCGATGCCG CTTATGAGCT CTACCGCAGC GGTAGATACG 
1121 CCAAGGTTCC CTACATTACT GGCAACCAGG AGGATGAGGG 
1161 TACTATTCTT GCCCCCGTTG CTATTAATGC TACCACTACT 
1201 CCCCATGTTA AGAAGTGGTT GAAGTACATT TGTAGCCAGG 
1241 CTTCTGACGC TTCGCTTGAT CGTGTTTTGT CGCTCTACCC 
1281 CGGCTCTTGG TCGGAGGGTT CACCATTCCG CACTGGTATT 
1321 CTTAATGCTC TTACCCCTCA GTTCAAGCGC ATTGCTGCCA 
1361 TTTTCACTGA TTTGCTGTTC CAGTCTCCTC GTCGTGTTAT 
1401 GCTTAACGCT ACCAAGGACG TCAACCGCTG GACTTACCTT
1441 GCCACCCAGC TCCATAACCT CGTTCCATTT TTGGGTACTT 
1481 TCCATGGCAG TGATCTTCTT TTTCAATACT ACGTGGACCT 
1521 TGGCCCATCT TCTGCTTACC GCCGCTACTT TATCTCGTTT 
1561 GCCAACCACC ACGACCCCAA CGTTGGTACC AACCTCCAAC

WO 94/01567 PCT/EP93/01763
FIGURE 8, 2/2 15/24

1601 AGTGGGATAT GTACACTGAT GCAGGCAAGG AGATGCTTCA 
1641 GATTCATATG ATTGGTAACT CTATGAGAAC TGACGACTTT 
1681 AGAATCGAGG GAATCTCGAA CTTTGAGTCT GACGTTACTC 
1721 TCTTCGGTTA ATCCCATTTA GCAAGTTTTG TGTATTTCAA 
1761 GTATACCAGT TGATGTAATA TATCAATAGA TTACAAATTA 
1801 ATTAGTGAAA AAAAAAAAAA AAAAAAAC 1828
FIGURE 11, 1/2 19/24
DNA SEQUENCE OF FLO1:

1 ATGACAATGC CTCATCGCTA TATGTTTTTG GCAGTCTTTA 
41 CACTTCTGGC ACTAACTAGT GTGGCCTCAG GAGCCACAGA 
81 GGCGTGCTTA CCAGCAGGCC AGAGGAAAAG TGGGATGAAT 
121 ATAAATTTTT ACCAGTATTC ATTGAAAGAT TCCTCCACAT 
161 ATTCGAATGC AGCATATATG GCTTATGGAT ATGCCTCAAA 
201 AACCAAACTA GGTTCTGTCG GAGGACAAAC TGATATCTCG 
241 ATTGATTATA ATATTCCCTG TGTTAGTTCA TCAGGCACAT 
281 TTCCTTGTCC TCAAGAAGAT TCCTATGGAA ACTGGGGATG 
321 CAAAGGAATG GGTGCTTGTT CTAATAGTCA AGGAATTGCA 
361 TACTGGAGTA CTGATTTATT TGGTTTCTAT ACTACCCCAA 
401 CAAACGTAAC CCTAGAAATG ACAGGTTATT TTTTACCACC 
441 ACAGACGGGT TCTTACACAT TCAAGTTTGC TACAGTTGAC 
481 GACTCTGCAA TTCTATCAGT AGGTGGTGCA ACCGCGTTCA 
521 ACTGTTGTGC TCAACAGCAA CCGCCGATCA CATCAACGAA 
561 CTTTACCATT GACGGTATCA AGCCATGGGG TGGAAGTTTG 
601 CCACCTAATA TCGAAGGAAC CGTCTATATG TACGCTGGCT 
641 ACTATTATCC AATGAAGGTT GTTTACTCGA ACGCTGTTTC 
681 TTGGGGTACA CTTCCAATTA GTGTGACACT TCCAGATGGT 
721 ACCACTGTAA GTGATGACTT CGAAGGGTAC GTCTATTCCT 
761 TTGACGATGA CCTAAGTCAA TCTAACTGTA CTGTCCCTGA 
801 CCCTTCAAAT TATGCTGTCA GTACCACTAC AACTACAACG 
841 GAACCATGGA CCGGTACTTT CACTTCTACA TCTACTGAAA 
881 TGACCACCGT CACCGGTACC AACGGCGTTC CAACTGACGA 
921 AACCGTCATT GTCATCAGAA CTCCAACCAG TGAAGGTCTA 
961 ATCAGCACCA CCACTGAACC ATGGACTGGC ACTTTCACTT 
1001 CGACTTCCAC TGAGGTTACC ACCATCACTG GAACCAACGG 
1041 TCAACCAACT GACGAAACTG TGATTGTTAT CAGAACTCCA 
1081 ACCAGTGAAG GTCTAATCAG CACCACCACT GAACCATGGA 
1121 CTGGTACTTT CACTTCTACA TCTACTGAAA TGACCACCGT 
1161 CACCGGTACT AACGGTCAAC CAACTGACGA AACCGTGATT 
1201 GTTATCAGAA CTCCAACCAG TGAAGGTTTG GTTACAACCA 
1241 CCACTGAACC ATGGACTGGT ACTTTTACTT CGACTTCCAC 
1281 TGAAATGTCT ACTGTCACTG GAACCAATGG CTTGCCAACT 
1321 GATGAAACTG TCATTGTTGT CAAAACTCCA ACTACTGCCA 
1361 TCTCATCCAG TTTGTCATCA TCATCTTCAG GACAAATCAC 
1401 CAGCTCTATC ACGTCTTCGC GTCCAATTAT TACCCCATTC 
1441 TATCCTAGCA ATGGAACTTC TGTGATTTCT TCCTCAGTAA 
1481 TTTCTTCCTC AGTCACTTCT TCTCTATTCA CTTCTTCTCC 
1521 AGTCATTTCT TCCTCAGTCA TTTCTTCTTC TACAACAACC 
1561 TCCACTTCTA TATTTTCTGA ATCATCTAAA TCATCCGTCA
FIGURE 11, 2/2

1601 TTCCAACCAG TAGTTCCACC TCTGGTTCTT CTGAGAGCGA
1641 AACGAGTTCA GCTGGTTCTG TCTCTTCTTC CTCTTTTATC 
1681 TCTTCTGAAT CATCAAAATC TCCTACATAT TCTTCTTCAT 
1721 CATTACCACT TGTTACCAGT GCGACAACAA GCCAGGAAAC 
1761 TGCTTCTTCA TTACCACCTG CTACCACTAC AAAAACGAGC 
1801 GAACAAACCA CTTTGGTTAC CGTGACATCC TGCGAGTCTC 
1841 ATGTGTGCAC TGAATCCATC TCCCCTGCGA TTGTTTCCAC 
1881 AGCTACTGTT ACTGTTAGCG GCGTCACAAC AGAGTATACC 
1921 ACATGGTGCC CTATTTCTAC TACAGAGACA ACAAAGCAAA 
1961 CCAAAGGGAC AACAGAGCAA ACCACAGAAA CAACAAAACA 
2001 AACCACGGTA GTTACAATTT CTTCTTGTGA ATCTGACGTA 
2041 TGCTCTAAGA CTGCTTCTCC AGCCATTGTA TCTACAAGCA 
2081 CTGCTACTAT TAACGGCGTT ACTACAGAAT ACACAACATG 
2121 GTGTCCTATT TCCACCACAG AATCGAGGCA ACAAACAACG 
2161 CTAGTTACTG TTACTTCCTG CGAATCTGGT GTGTGTTCCG 
2201 AAACTGCTTC ACCTGCCATT GTTTCGACGG CCACGGCTAC 
2241 TGTGAATGAT GTTGTTACGG TCTATCCTAC ATGGAGGCCA 
2281 CAGACTGCGA ATGAAGAGTC TGTCAGCTCT AAAATGAACA 
2321 GTGCTACCGG TGAGACAACA ACCAATACTT TAGCTGCTGA 
2361 AACGACTACC AATACTGTAG CTGCTGAGAC GATTACCAAT 
2401 ACTGGAGCTG CTGAGACGAA AACAGTAGTC ACCTCTTCGC 
2441 TTTCAAGATC TAATCACGCT GAAACACAGA CGGCTTCCGC 
2481 GACCGATGTG ATTGGTCACA GCAGTAGTGT TGTTTCTGTA 
2521 TCCGAAACTG GCAACACCAA GAGTCTAACA AGTTCCGGGT 
2561 TGAGTACTAT GTCGCAACAG CCTCGTAGCA CACCAGCAAG 
2601 CAGCATGGTA GGATATAGTA CAGCTTCTTT AGAAATTTCA 
2641 ACGTATGCTG GCAGTGCAAC AGCTTACTGG CCGGTAGTGG 
2681 TTTAA 2685

Fig. 12.

CONSTRUCTION OF pUR2990

PCR with oligonucleotides pcrflo1 & pcrflo2

Isolate 1950 bp fragment

cut with NheI and HindIII

ligate into HindIII/NheI (p) digested pUR2972

[57] ABSTRACT

A method is provided for immobilizing an enzyme, com- 
prising immobilizing the enzyme or a functional part thereof 
to the cell wall of a microbial cell using recombinant DNA 
techniques. The enzyme is immobilized by linking it to the 
C-terminal part of a protein that ensures anchoring in the cell 
wall. Also provided is a recombinant polynucleotide com- 
prising a structural gene encoding an enzyme protein, a part 
of a gene encoding the C-terminal part of a protein capable 
of anchoring in a eukaryotic or prokaryotic cell wall, as well 
as a signal sequence, in addition to a chimeric protein 
encoded by the recombinant polynucleotide and a vector and 
a microorganism containing the polynucleotide. The micro- 
organism is suitable for carrying out enzymatic processes on 
an industrial scale. 



8907140 8/1989 WIPO . 



17 Claims, 24 Drawing Sheets 



U.S. Patent Feb. 22, 2000 sheet 1 of 24 6,027,910 

FIG. 1A

1 AAGCTTTAGG TAAGGGAGGC AGGGGGAAAA GATACTGAAA 
41 TGACGGAAAA CGAGAATATG GAGCAGGGAG CAACTTTTAG 
81 AGCTTTACCC GTTAAAAGGT CAAATCGAGG CTTCCTGCCT 
121 TTGTCTGATT TTAGTAGTAC CGGAAGGTTT ATTACGCCCA 
161 AGAACAGTGC TTGAATTGAG TTCTCGGGAC ACGGGAAAGA 
201 CAATGGAAGA AAAATTTACA TTCAGTAGCC TTATATATGA 
241 AATGCTGCGA AGCCACGTCT TTATAAGTAG ATAATGTCCC 
281 ATGAGCTGAA CTATGGGAAT TTATGACGCA GTTCATTGTA 
321 TATATATTAC ATTAACTCTT TAGTTTAACA TCTGAATTGT 
361 TTTATAAAAT AACTTTTTGA ATTTTTTTAT GATCGCTTAG 
401 TTAAGTCTAT TATATCAGGT TTTTTCATTC ATCATAATTG 
441 TTCGTTAAAT ATGAGTATAT TTAAATACAG GAATTAGTAT 
481 CATTTGCAGT CACGAAAAGG GCCGTTTCAT AGAGAGTTTT 
521 CTTAATAAAG TTGAGGGTTT CCGTGATAGT TTTGAGGGGT 
561 TGTTTGAACT AGATTTACGC TTACCTTTGA ACTGATTAAT 
601 TTTTTCAGCG GGCTTATCAT AATCATCCAT CATAGCAGTC 
641 TTTCTGGACT TCGTCGAGGA CTGGCTTTCT GAATTTTGAC 
681 GGTCCCTATT AGCTCCAGTT GGAGGAATTG AGTTACCTAC 
721 AACTGGCAAG AGGTCTTTGT TTGGATTCAA AATAGGACTT 
761 TGTGGTAGCA GTTTGGTTTT ATTCAATCTA AAGATATGAG 
801 AAACAGGTTT TAAGTAAATC GATACTATTG TACCAATGTT 
841 TAGCTCCAAT TCCTCCAAAA CGGTGGGATC TAATTTTGTG 
881 TTCATTTCTA TTAGTGGCAA CTCTCCGTCC AGTACTGATT
921 TTAAAGATTC AAAAGTTATC GCGTTTGATA TACGAGACGT 
961 TTTCGTTAAT GACAGCAATC TCCAATACAT CAGTGTTTTA 
1001 TCTCTTAAGT CAGGATTATT TTCGTGATCG GTGCATCCTT 
1041 TTAATAAATC CATACAAAGT TCTTCAGTTT CCTTTGTAGG 
1081 ATTTCTGATG AAGAATTTTA TTGCTGAGTT CAGAATGGAA 
1121 AATTGCACTT CTAGCGTCTC ATTAAACATG TTTGAGGAAA 
1161 AAACTCTAAA TAACTCCAGG TAGTTTGGAA TTACATCCGA 
1201 ATATTGCGTT ATTATCCAGA TCATAGCGTT TTTTGATTCA 
1241 GGTTCCTGTA CAACTTCAGT GTGTTTGACT AGTTCTGTTA 
1281 CGTTTGCTTT AAAATTATTG GGATATTTCC TCAAAATATT 
1321 TCTGAAAACC GAAATAATCT CCTGGACGAC ATAATCAACA
1361 CCGAATTCTA ACAAATCTAG TAGCACAGCG ACACAATCGT
1401 GTACAGAGTC TTCATCTAGC TTAACAGCGA GATTACCAAT
1441 GGCTCTGACT GATTTCCTTG ACATTTGAAT ATCAATATCT
1481 GTAGCATATT GTTCCAACTC TTCTAGAATT CTTGGTAATG
1521 TTTCCTTGTT AGCTAAAAGA TATAAACACT CTAATTTCGT 
1561 GTCTTTGATG TATATGGGGT CATTGTACTC GATGAAAAAA
FIG. 1B

1601 TACGAAATGT CTAGCCTGAG TAGAGATGAC TCCCTACTCA 
1641 ATAAAAGAAG AATAACGTTT CTTAATACTA AAAATTGTAA 
1681 TTCAGGCGGC TTATCTAACA AAGCTATTAC AGAGTTAGAT 
1721 AGCTTTTCGG CTAGAGTTTC TTTGATGACG TCAACATAAT 
1761 TCAACAAGTA CATGATGAAT TTTAAAGAGT TCAACACTAC 
1801 GTATGTGTTT ACTTGTTGCA GGTACGGTAA AGCTAGTTCG 
1841 ATCATTTCAT GGGTATCCAA ATAATGCTGC GGCACAACCG 
1881 AAGTCGTCAA AACTTCCAAA ACAGTAGCCT TATTCCACTC 
1921 ATTTAATTCG GGTAAAAGTT CTAGCATGTC AAAAGCGAGT 
1961 TCCAAGGGAA TCCTGAAGGT TCCATGTTAG CGTTTTTTTC 
2001 GTGAATGGAA TATAAAGTAT GTAATGCAGC TACAATGACT 
2041 TCTGGAGAGC TCGACTGTGC CTTTACAATG TCATGTAGAA 
2081 TGCTTGATAA CCCCAATACC CTTTCATGAT CAATTTCATC
2121 TAAATCCAAC AGTGCGTAAA TTGCTGTCCT CGTCACTTGT 
2161 TCAGGTGGAG ACTTGTGATT TACCAATGAA ATGATACAGT 
2201 CGAAGGCCTG ATCAGATAGC TCTTTCACCG GGACTAATAC 
2241 CAGAGTTCTT AGTGCCATTA TTTGTAACTT TTCATCTCTG 
2281 CTTTTGAAAT CGTCCATTAT AAATGGCAAA GCCTCTCTGG 
2321 CCTGCTGAGG TTTTAATGCG CCGATCACCC TAATATACTC 
2361 ATGGCAAATT CTTTTCACTT CTAGATCATC TTCAATTTGC 
2401 CAAAATTTCA AGAGCTCAGA AAACAGAAGG GACATTTCGC 
2441 CATAGTTTCC TAGAACCAAA TTGGCGATAA TTTTTCTCAG
2481 AGCATTTTTC CTTCTTGTTA TATTCGATTT AAACTTTTTT
2521 ACTCCAAAAT GTTGCAGATC TGTGACGATT TCATTTGCTT 
2561 TATATCTGGC AAAAACTTTT TGATCGGACA TAAGCGAAAT 
2601 ACGTCCTATT AATGAAGTGA ATGTTCTTGC TGTATTCCCT 
2641 TCTTGTGCAG TAGATTAATT CTGTTTCCAG GCTGCGATAC 
2681 TTTGATACCC AATACTAAAA GTTGATGATT TGAACGATCT 
2721 CCTATTTCCT CGCACATTTT TGGAGCGATA CCCGGAAGAC 
2761 AGAATCGCGA TGTTAAGAAA ATAGTTCTGA TGGCACTAAA 
2801 GAGATCATGA TTAAGGAAAG GTAAGTGATA TGCATGAATG 
2841 GGAATAGGCT TTCGAACTTG ACGATTTAGT TCCTTATTTC
2881 TATCCATCTA ATCCTCCAAC TTCAATAGGC CTTATCTAGC
2921 TCAGAGCAGT ATTTAATTGA GAATAGTAGC TTAATTGAAA
2961 CCTTACTAAA AAAGTGTATG GTTACATAAG ATAAGGCGTT
3001 AAGAAGAGTA TACATATGCA TTATTCATTA CCAAGACCAC
3041 TATGAATAGT AATACCATAT TTAGCTTTTG AAACTCATGT
3081 TTTCTATTGT GTTGTTTCAA ATTCCTCTGT TAGGCTCAAT
3121 TTAGGTTAAT TAAATTATAA AAAAATATAA AAAATAAAGA 
3161 AAGTTTATCC ATCGGCACCT CAATTCAATG GAGTAAACAG 
3201 TTTCAACACT GAGTGGTGAA ACATTGAACA ACTACATGCA 
3241 GTTTCCCGCC ACGAGGCAAG TGTAGGTCCT TTGTCCATTT
FIG. 1C

3281 CGCTTTGTTT TGCAGGTCAT TGATGACCTA ATTAGGAAGG 
3321 TAGAAGCCGC TCCAGCTCAA TAAGGAAATG CTAAGGGTAC 
3361 TCGCCTTTGG TGTTTTACCA TACAATGGCA GCTTTATGTC
3401 ACTTCATTCT TCAGTAACGG CGCTTAAATA TTCCCAAAAA
3441 CGTTACAATG GAATTGTTTG ATCATGTAAC GAAATGCAAT
3481 CTTCTAAAAA AAAAGCCATG TGAATCAAAA AAAGATTCCT 
3521 TTTAGCATAC TATAAATATG CAAAATGCCC TCTATTTATT 
3561 CTAGTAATCG TCCATTCTCA TATCTTCCTT ATATCAGTCG 
3601 CCTCGCTTAA TATAGTCAGC ACAAAAGGAA CAACAATTCG
3641 CCAGTTTTCA AAATGTTCAC TTTTCTCAAA ATTATTCTGT
3681 GGCTTTTTTC CTTGGCATTG GCCTCTGCTA TAAATATCAA
3721 CGATATCACA TTTTCCAATT TAGAAATTAC TCCACTGACT 
3761 GCAAATAAAC AACCTGATCA AGGTTGGACT GCCACTTTTG 
3801 ATTTTAGTAT TGCAGATGCG TCTTCCATTA GGGAGGGCGA
3841 TGAATTCACA TTATCAATGC CACATGTTTA TAGGATTAAG
3881 CTATTAAACT CATCGCAAAC AGCTACTATT TCCTTAGCGG 
3921 ATGGTACTGA GGCTTTCAAA TGCTATGTTT CGCAACAGGC 
3961 TGCATACTTG TATGAAAATA CTACTTTCAC ATGTACTGCT 
4001 CAAAATGACC TGTCCTCCTA TAATACGATT GATGGATCCA 
4041 TAACATTTTC GCTAAATTTT AGTGATGGTG GTTCCAGCTA 
4081 TGAATATGAG TTAGAAAACG CTAAGTTTTT CAAATCTGGG 
4121 CCAATGCTTG TTAAACTTGG TAATCAAATG TCAGATGTGG 
4161 TGAATTTCGA TCCTGCTGCT TTTACAGAGA ATGTTTTTCA 
4201 CTCTGGGCGT TCAACTGGTT ACGGTTCTTT TGAAAGTTAT 
4241 CATTTGGGTA TGTATTGTCC AAACGGATAT TTCCTGGGTG 
4281 GTACTGAGAA GATTGATTAC GACAGTTCCA ATAACAATGT 
4321 CGATTTGGAT TGTTCTTCAG TTCAGGTTTA TTCATCCAAT 
4361 GATTTTAATG ATTGGTGGTT CCCGCAAAGT TACAATGATA 
4401 CCAATGCTGA CGTCACTTGT TTTGGTAGTA ATCTGTGGAT 
4441 TACACTTGAC GAAAAACTAT ATGATGGGGA AATGTTATGG 
4481 GTTAATGCAT TACAATCTCT ACCCGCTAAT GTAAACACAA 
4521 TAGATCATGC GTTAGAATTT CAATACACAT GCCTTGATAC 
4561 CATAGCAAAT ACTACGTACG CTACGCAATT CTCGACTACT
4601 AGGGAATTTA TTGTTTATCA GGGTCGGAAC CTCGGTACAG
4641 CTAGCGCCAA AAGCTCTTTT ATCTCAACCA CTACTACTGA
4681 TTTAACAAGT ATAAACACTA GTGCGTATTC CACTGGATCC
4721 ATTTCCACAG TAGAAACAGG CAATCGAACT ACATCAGAAG
4761 TGATCAGTCA TGTGGTGACT ACCAGCACAA AACTGTCTCC 
4801 AACTGCTACT ACCAGCCTGA CAATTGCACA AACCAGTATC 
4841 TATTCTACTG ACTCAAATAT CACAGTAGGA ACAGATATTC
4881 ACACCACATC AGAAGTGATT AGTGATGTGG AAACCATTAG
4921 CAGAGAAACA GCTTCGACCG TTGTAGCCGC TCCAACCTCA
FIG. 1D

4961 ACAACTGGAT GGACAGGCGC TATGAATACT TACATCCCGC 
5001 AATTTACATC CTCTTCTTTC GCAACAATCA ACAGCACACC
5041 AATAATCTCT TCATCAGCAG TATTTGAAAC CTCAGATGCT
5081 TCAATTGTCA ATGTGCACAC TGAAAATATC ACGAATACTG
5121 CTGCTGTTCC ATCTGAAGAG CCCACTTTTG TAAATGCCAC 
5161 GAGAAACTCC TTAAATTCCT TCTGCAGCAG CAAACAGCCA 
5201 TCCAGTCCCT CATCTTATAC GTCTTCCCCA CTCGTATCGT
5241 CCCTCTCCGT AAGCAAAACA TTACTAAGCA CCAGTTTTAC
5281 GCCTTCTGTG CCAACATCTA ATACATATAT CAAAACGGAA 
5321 AATACGGGTT ACTTTGAGCA CACGGCTTTG ACAACATCTT 
5361 CAGTTGGCCT TAATTCTTTT AGTGAAACAG CACTCTCATC 
5401 TCAGGGAACG AAAATTGACA CCTTTTTAGT GTCATCCTTG 
5441 ATCGCATATC CTTCTTCTGC ATCAGGAAGC CAATTGTCCG 
5481 GTATCCAACA GAATTTCACA TCAACTTCTC TCATGATTTC 
5521 AACCTATGAA GGTAAAGCGT CTATATTTTT CTCAGCTGAG 
5561 CTCGGTTCGA TCATTTTTCT GCTTTTGTCG TACCTGCTAT 
5601 TCTAAAACGG GTACTGTACA GTTAGTACAT TGAGTCGAAA 
5641 TATACGAAAT TATTGTTCAT AATTTTCATC CTGGCTCTTT 
5681 TTTTCTTCAA CCATAGTTAA ATGGACAGTT CATATCTTAA 
5721 ACTCTAATAA TACTTTTCTA GTTCTTATCC TTTTCCGTCT 
5761 CACCGCAGAT TTTATCATAG TATTAAATTT ATATTTTGTT 
5801 CGTAAAAAGA AAAATTTGTG AGCGTTACCG CTCGTTTCAT 
5841 TACCCGAAGG CTGTTTCAGT AGACCACTGA TTAAGTAAGT 
5881 AGATGAAAAA ATTTCATCAC CATGAAAGAG TTCGATGAGA 
5921 GCTACTTTTT CAAATGCTTA ACAGCTAACC GCCATTCAAT 
5961 AATGTTACGT TCTCTTCATT CTGCGGCTAC GTTATCTAAC
6001 AAGAGGTTTT ACTCTCTCAT ATCTCATTCA AATAGAAAGA 
6041 ACATAATCAA AAAGCTT 6057
FIG. 8A

1 AATTCGGCAC GAGATTCCTT TGATTTGCAA CTGTTAATCA 
41 TGGTTTCCAA AAGCTTTTTT TTGGCTGCGG CGCTCAACGT 
81 AGTGGGCACC TTGGCCCAGG CCCCCACGGC CGTTCTTAAT 
121 GGCAACGAGG TCATCTCTGG TGTCCTTGAG GGCAAGGTTG 
161 ATACCTTCAA GGGAATCCCA TTTGCTGACC CTCCTGTTGG 
201 TGACTTGCGG TTCAAGCACC CCCAGCCTTT CACTGGATCC 
241 TACCAGGGTC TTAAGGCCAA CGACTTCAGC TCTGCTTGTA 
281 TGCAGCTTGA TCCTGGCAAT GCCTTTTCTT TGCTTGACAA 
321 AGTAGTGGGC TTGGGAAAGA TTCTTCCTGA TAACCTTAGA 
361 GGCCCTCTTT ATGACATGGC CCAGGGTAGT GTCTCCATGA 
401 ATGAGGACTG TCTCTACCTT AACGTTTTCC GCCCCGCTGG 
441 CACCAAGCCT GATGCTAAGC TCCCCGTCAT GGTTTGGATT 
481 TACGGTGGTG CCTTTGTGTT TGGTTCTTCT GCTTCTTACC 
521 CTGGTAACGG CTACGTCAAG GAGAGTGTGG AAATGGGCCA 
561 GCCTGTTGTG TTTGTTTCCA TCAACTACCG TACCGGCCCC 
601 TATGGATTCT TGGGTGGTGA TGCCATCACC GCTGAGGGCA 
641 ACACCAACGC TGGTCTGCAC GACCAGCGCA AGGGTCTCGA 
681 GTGGGTTAGC GACAACATTG CCAACTTTGG TGGTGATCCC 
721 GACAAGGTCA TGATTTTCGG TGAGTCCGCT GGTGCCATGA 
761 GTGTTGCTCA CCAGCTTGTT GCCTACGGTG GTGACAACAC 
801 CTACAACGGA AAGCAGCTTT TCCACTCTGC CATTCTTCAG 
841 TCTGGCGGTC CTCTTCCTTA CTTTGACTCT ACTTCTGTTG 
881 GTCCCGAGAG TGCCTACAGC AGATTTGCTC AGTATGCCGG 
921 ATGTGACACC AGTGCCAGTG ATAATGACAC TCTGGCTTGT 
961 CTCCGCAGCA AGTCCAGCGA TGTCTTGCAC AGTGCGCAGA 
1001 ACTCGTATGA TCTTAAGGAC CTGTTTGGTC TGCTCCCTCA 
1041 ATTCCTTGGA TTTGGTCCCA GACCCGACGG CAACATTATT 
1081 CCCGATGCCG CTTATGAGCT CTACCGCAGC GGTAGATACG 
1121 CCAAGGTTCC CTACATTACT GGCAACCAGG AGGATGAGGG 
1161 TACTATTCTT GCCCCCGTTG CTATTAATGC TACCACTACT
1201 CCCCATGTTA AGAAGTGGTT GAAGTACATT TGTAGCCAGG
1241 CTTCTGACGC TTCGCTTGAT CGTGTTTTGT CGCTCTACCC
1281 CGGCTCTTGG TCGGAGGGTT CACCATTCCG CACTGGTATT
1321 CTTAATGCTC TTACCCCTCA GTTCAAGCGC ATTGCTGCCA
1361 TTTTCACTGA TTTGCTGTTC CAGTCTCCTC GTCGTGTTAT
1401 GCTTAACGCT ACCAAGGACG TCAACCGCTG GACTTACCTT
1441 GCCACCCAGC TCCATAACCT CGTTCCATTT TTGGGTACTT
1481 TCCATGGCAG TGATCTTCTT TTTCAATACT ACGTGGACCT
1521 TGGCCCATCT TCTGCTTACC GCCGCTACTT TATCTCGTTT 
1561 GCCAACCACC ACGACCCCAA CGTTGGTACC AACCTCCAAC
FIG. 8B

1601 AGTGGGATAT GTACACTGAT GCAGGCAAGG AGATGCTTCA 
1641 GATTCATATG ATTGGTAACT CTATGAGAAC TGACGACTTT 
1681 AGAATCGAGG GAATCTCGAA CTTTGAGTCT GACGTTACTC 
1721 TCTTCGGTTA ATCCCATTTA GCAAGTTTTG TGTATTTCAA 
1761 GTATACCAGT TGATGTAATA TATCAATAGA TTACAAATTA 
1801 ATTAGTGAAA AAAAAAAAAA AAAAAAAC 1828
FIG. 11A



1 ATGACAATGC CTCATCGCTA TATGTTTTTG GCAGTCTTTA 
41 CACTTCTGGC ACTAACTAGT GTGGCCTCAG GAGCCACAGA 
81 GGCGTGCTTA CCAGCAGGCC AGAGGAAAAG TGGGATGAAT 
121 ATAAATTTTT ACCAGTATTC ATTGAAAGAT TCCTCCACAT 
161 ATTCGAATGC AGCATATATG GCTTATGGAT ATGCCTCAAA 
201 AACCAAACTA GGTTCTGTCG GAGGACAAAC TGATATCTCG
241 ATTGATTATA ATATTCCCTG TGTTAGTTCA TCAGGCACAT 
281 TTCCTTGTCC TCAAGAAGAT TCCTATGGAA ACTGGGGATG
321 CAAAGGAATG GGTGCTTGTT CTAATAGTCA AGGAATTGCA 
361 TACTGGAGTA CTGATTTATT TGGTTTCTAT ACTACCCCAA 
401 CAAACGTAAC CCTAGAAATG ACAGGTTATT TTTTACCACC
441 ACAGACGGGT TCTTACACAT TCAAGTTTGC TACAGTTGAC 
481 GACTCTGCAA TTCTATCAGT AGGTGGTGCA ACCGCGTTCA 
521 ACTGTTGTGC TCAACAGCAA CCGCCGATCA CATCAACGAA 
561 CTTTACCATT GACGGTATCA AGCCATGGGG TGGAAGTTTG 
601 CCACCTAATA TCGAAGGAAC CGTCTATATG TACGCTGGCT 
641 ACTATTATCC AATGAAGGTT GTTTACTCGA ACGCTGTTTC 
681 TTGGGGTACA CTTCCAATTA GTGTGACACT TCCAGATGGT 
721 ACCACTGTAA GTGATGACTT CGAAGGGTAC GTCTATTCCT 
761 TTGACGATGA CCTAAGTCAA TCTAACTGTA CTGTCCCTGA 
801 CCCTTCAAAT TATGCTGTCA GTACCACTAC AACTACAACG 
841 GAACCATGGA CCGGTACTTT CACTTCTACA TCTACTGAAA 
881 TGACCACCGT CACCGGTACC AACGGCGTTC CAACTGACGA
921 AACCGTCATT GTCATCAGAA CTCCAACCAG TGAAGGTCTA 
961 ATCAGCACCA CCACTGAACC ATGGACTGGC ACTTTCACTT 
1001 CGACTTCCAC TGAGGTTACC ACCATCACTG GAACCAACGG 
1041 TCAACCAACT GACGAAACTG TGATTGTTAT CAGAACTCCA 
1081 ACCAGTGAAG GTCTAATCAG CACCACCACT GAACCATGGA 
1121 CTGGTACTTT CACTTCTACA TCTACTGAAA TGACCACCGT 
1161 CACCGGTACT AACGGTCAAC CAACTGACGA AACCGTGATT 
1201 GTTATCAGAA CTCCAACCAG TGAAGGTTTG GTTACAACCA 
1241 CCACTGAACC ATGGACTGGT ACTTTTACTT CGACTTCCAC 
1281 TGAAATGTCT ACTGTCACTG GAACCAATGG CTTGCCAACT 
1321 GATGAAACTG TCATTGTTGT CAAAACTCCA ACTACTGCCA
1361 TCTCATCCAG TTTGTCATCA TCATCTTCAG GACAAATCAC
1401 CAGCTCTATC ACGTCTTCGC GTCCAATTAT TACCCCATTC
1441 TATCCTAGCA ATGGAACTTC TGTGATTTCT TCCTCAGTAA
1481 TTTCTTCCTC AGTCACTTCT TCTCTATTCA CTTCTTCTCC
1521 AGTCATTTCT TCCTCAGTCA TTTCTTCTTC TACAACAACC
1561 TCCACTTCTA TATTTTCTGA ATCATCTAAA TCATCCGTCA



FIG. 11B

1601 TTCCAACCAG TAGTTCCACC TCTGGTTCTT CTGAGAGCGA 
1641 AACGAGTTCA GCTGGTTCTG TCTCTTCTTC CTCTTTTATC 
1681 TCTTCTGAAT CATCAAAATC TCCTACATAT TCTTCTTCAT
1721 CATTACCACT TGTTACCAGT GCGACAACAA GCCAGGAAAC 
1761 TGCTTCTTCA TTACCACCTG CTACCACTAC AAAAACGAGC
1801 GAACAAACCA CTTTGGTTAC CGTGACATCC TGCGAGTCTC
1841 ATGTGTGCAC TGAATCCATC TCCCCTGCGA TTGTTTCCAC 
1881 AGCTACTGTT ACTGTTAGCG GCGTCACAAC AGAGTATACC 
1921 ACATGGTGCC CTATTTCTAC TACAGAGACA ACAAAGCAAA 
1961 CCAAAGGGAC AACAGAGCAA ACCACAGAAA CAACAAAACA 
2001 AACCACGGTA GTTACAATTT CTTCTTGTGA ATCTGACGTA 
2041 TGCTCTAAGA CTGCTTCTCC AGCCATTGTA TCTACAAGCA 
2081 CTGCTACTAT TAACGGCGTT ACTACAGAAT ACACAACATG 
2121 GTGTCCTATT TCCACCACAG AATCGAGGCA ACAAACAACG 
2161 CTAGTTACTG TTACTTCCTG CGAATCTGGT GTGTGTTCCG 
2201 AAACTGCTTC ACCTGCCATT GTTTCGACGG CCACGGCTAC 
2241 TGTGAATGAT GTTGTTACGG TCTATCCTAC ATGGAGGCCA 
2281 CAGACTGCGA ATGAAGAGTC TGTCAGCTCT AAAATGAACA 
2321 GTGCTACCGG TGAGACAACA ACCAATACTT TAGCTGCTGA 
2361 AACGACTACC AATACTGTAG CTGCTGAGAC GATTACCAAT
2401 ACTGGAGCTG CTGAGACGAA AACAGTAGTC ACCTCTTCGC 
2441 TTTCAAGATC TAATCACGCT GAAACACAGA CGGCTTCCGC 
2481 GACCGATGTG ATTGGTCACA GCAGTAGTGT TGTTTCTGTA 
2521 TCCGAAACTG GCAACACCAA GAGTCTAACA AGTTCCGGGT 
2561 TGAGTACTAT GTCGCAACAG CCTCGTAGCA CACCAGCAAG 
2601 CAGCATGGTA GGATATAGTA CAGCTTCTTT AGAAATTTCA 
2641 ACGTATGCTG GCAGTGCAAC AGCTTACTGG CCGGTAGTGG 
2681 TTTAA 2685

FIG. 12

PROCESS FOR IMMOBILIZING ENZYMES
TO THE CELL WALL OF A MICROBIAL
CELL BY PRODUCING A FUSION PROTEIN

The present invention is in the field of conversion
processes using immobilized enzymes, produced by genetic
engineering.

BACKGROUND OF THE INVENTION 

In the detergent, personal care and food products industries
there is a strong trend towards natural ingredients in these
products and towards environmentally acceptable production
processes. Enzymic conversions are very important for fulfilling
these consumer demands, as these processes can be
completely natural. Moreover, enzymic processes are very
specific and consequently produce minimal amounts
of waste products. Such processes can be carried out in
water at mild temperatures and atmospheric pressure. However,
enzymic processes based on free enzymes are either
quite expensive due to the loss of enzymes or require
expensive equipment, such as ultra-membrane systems, to entrap
the enzyme.

Alternatively, enzymes can be immobilized either physically
or chemically. The latter method often has the disadvantage
that coupling is carried out using non-natural chemicals
and in processes that are not attractive from an
environmental point of view. Moreover, chemical modification
of enzymes is rarely very specific, which
means that coupling can affect the activity of the enzyme
negatively. Physical immobilization can comply with consumer
demands; however, physical immobilization may also
affect the activity of the enzyme in a negative way.
Moreover, a physically immobilized enzyme is in equilibrium
with free enzyme, which means that in continuous
reactors, according to the laws of thermodynamics, substantial
losses of enzyme are unavoidable.

There are a few publications on immobilization of
enzymes to microbial cells (see reference 1). The present
invention provides a method for immobilizing enzymes to
cell walls of microbial cells in a very precise way.
Additionally, the immobilization does not require any
chemical or physical coupling step and is very efficient.
Some extracellular proteins are known to have special
functions which they can perform only if they remain bound
to the cell wall of the host cell. Often this type of protein has
a long C-terminal part that anchors it in the cell wall. These
C-terminal parts have very special amino acid sequences. A
typical example is anchoring via C-terminal sequences
enriched in proline (see reference 2). Another mechanism to
anchor proteins in cell walls is that the protein has a
glycosyl-phosphatidyl-inositol (GPI) anchor (see reference
3) and that the C-terminal part of the protein contains a
substantial number of potential serine and threonine glycosylation
sites. O-Glycosylation of these sites gives a rod-like
conformation to the C-terminal part of these proteins.
Another feature of these manno-proteins is that they seem to
be linked to the glucan in the cell wall of lower eukaryotes,
as they cannot be extracted from the cell wall with SDS, but
can be liberated by glucanase treatment.

SUMMARY OF THE INVENTION 

The present invention provides a method for immobilizing
an enzyme, which comprises the use of recombinant
DNA techniques for producing an enzyme or a functional
part thereof linked to the cell wall of a host cell, preferably
a microbial cell, and whereby the enzyme or functional
fragment thereof is localized at the exterior of the cell wall.
Preferably the enzyme or the functional part thereof is
immobilized by linking to the C-terminal part of a protein
that ensures anchoring in the cell wall.

In one embodiment of the invention a recombinant polynucleotide
is provided comprising a structural gene encoding
a protein providing catalytic activity and at least a part
of a gene encoding a protein capable of anchoring in a
eukaryotic or prokaryotic cell wall, said part encoding at
least the C-terminal part of said anchoring protein. Preferably
the polynucleotide further comprises a sequence encoding
a signal peptide ensuring secretion of the expression
product of the polynucleotide. Such signal peptide can be
derived from a glycosyl-phosphatidyl-inositol (GPI) anchoring
protein, α-factor, α-agglutinin, invertase or inulinase,
α-amylase of Bacillus, or a proteinase of lactic acid bacteria.
The DNA sequence encoding a protein capable of anchoring
in the cell wall can encode α-agglutinin, AGA1
(a-agglutinin), FLO1 (flocculation protein) or the Major Cell
Wall Protein of lower eukaryotes, or a proteinase of lactic
acid bacteria. The recombinant polynucleotide is operably
linked to a promoter, preferably an inducible promoter. The
DNA sequence encoding a protein providing catalytic activity
can encode a hydrolytic enzyme, e.g. a lipase, or an
oxidoreductase, e.g. an oxidase. Another embodiment of the
invention relates to a recombinant vector comprising a
polynucleotide as described above. If such vector contains a
DNA sequence encoding a protein providing catalytic
activity, which protein exhibits said catalytic activity when
present in a multimeric form, said vector can further comprise
a second polynucleotide comprising a structural gene
encoding the same protein providing catalytic activity combined
with a sequence encoding a signal peptide ensuring
secretion of the expression product of said second
polynucleotide, said second polynucleotide being operably
linked to a regulatable promoter, preferably an inducible or
repressible promoter.

A further embodiment of the invention relates to a chi- 
meric protein encoded by a polynucleotide as described 
above. 

Still another embodiment is a host cell, preferably a 
microorganism, containing a polynucleotide as described 
above or a vector as described above. If the protein provid- 
ing catalytic activity exhibits said catalytic activity when 
present in a multimeric form, said host cell or microorgan- 
ism can further comprise a second polynucleotide compris- 
ing a structural gene encoding the same protein providing 
catalytic activity combined with a sequence encoding a 
signal peptide ensuring secretion of the expression product 
of said second polynucleotide, said second polynucleotide 
being operably linked to a regulatable promoter, preferably 
an inducible or repressible promoter, and said second poly- 
nucleotide being present either in another vector or in the 
chromosome of said microorganism. Preferably the host cell 
or microorganism has at least one of said polynucleotides 
integrated in its chromosome. As a result of culturing such 
host cell or microorganism the invention provides a host 
cell, preferably a microorganism, having a protein as 
described above immobilized on its cell wall. The host cell 
or microorganism can be a lower eukaryote, in particular a 
yeast. 

The invention also provides a process for carrying out an 
enzymatic process by using an immobilized catalytically 
active protein, wherein a substrate for said catalytically 
active protein is contacted with a host cell or microorganism 
according to the invention. 

BRIEF DESCRIPTION OF THE FIGURES 

FIG. 1: DNA sequence of the 6057 bp HindIII fragment
containing the complete AGα1 gene of S. cerevisiae (see
SEQ ID NO: 1). The position of the unique NheI site and the
HindIII site used for the described constructions is specified
in the header.
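
The restriction sites named in this legend can be located in the listing by a plain substring search. The sketch below is illustrative only and is not part of the patent: it assumes the standard recognition sequences AAGCTT (HindIII) and GCTAGC (NheI), and the function name `find_sites` is hypothetical. It scans only two rows of the FIG. 1 listing, so it says nothing about whether a site is unique in the full fragment.

```python
# Recognition sequences of the two enzymes named in the FIG. 1 legend.
SITES = {"HindIII": "AAGCTT", "NheI": "GCTAGC"}

def find_sites(seq, first_position=1):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = seq.upper()
    hits = {}
    for enzyme, motif in SITES.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(first_position + start)
            start = seq.find(motif, start + 1)
        hits[enzyme] = positions
    return hits

# Rows 4601 and 4641 of the FIG. 1 listing, joined without spaces.
excerpt = ("AGGGAATTTATTGTTTATCAGGGTCGGAACCTCGGTACAG"
           "CTAGCGCCAAAAGCTCTTTTATCTCAACCACTACTACTGA")
print(find_sites(excerpt, first_position=4601))
# {'HindIII': [], 'NheI': [4640]}
```

In this excerpt the NheI site spans the boundary between the two rows, which is why the rows are joined before searching.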

FIG. 2: Schematic presentation of the construction of
pUR2969. The restriction sites for endonucleases used are
shown. Abbreviations used:

AG-alpha-1: gene expressing α-agglutinin from S. cerevisiae

amp: β-lactamase resistance gene

PGKp: phosphoglycerate kinase promoter

PGKt: terminator of the same gene.

FIG. 3: α-Galactosidase activity of S. cerevisiae MT302/1C
cells and culture fluid transformed with pSY13 during
batch culture:

A: U/l α-galactosidase per time; the OD530 is also shown

B: α-galactosidase activity of free and bound enzyme
expressed in U/OD530.

FIG. 4: α-Galactosidase activity of S. cerevisiae MT302/1C
cells and culture fluid transformed with pUR2969 during
batch culture:

A: U/l α-galactosidase per time; the OD530 is also shown

B: α-galactosidase activity of free and bound enzyme
expressed in U/OD530.

FIG. 5: Western analysis with anti-α-galactosidase serum
of extracellular fractions of cells in exponential phase
(OD530=2). The analyzed fractions are equivalent to 4 mg
cell walls (fresh weight):

A: MT302/1C expressing α-galactosidase,

lane 1, growth medium

lane 2, SDS extract of isolated cell walls

lane 3, glucanase extract of SDS-extracted cell walls;

B: MT302/1C expressing α-Gal-AGα1 fusion protein,

lane 1, growth medium

lane 2, SDS extract of isolated cell walls

lane 3, glucanase extract of SDS-extracted cell walls

lane 4, Endo-H treated glucanase extract.

FIG. 6: Immunofluorescent labelling (anti-α-galactosidase)
of MT302/1C cells in the exponential
phase (OD530=2) expressing the α-Gal-α-agglutinin fusion
protein.

Phase micrograph of intact cells. A: overview B: detail.

FIG. 7: Schematic presentation of the construction of
pUR2970A, pUR2971A, pUR2972A, and pUR2973. The
restriction sites for endonucleases used are indicated in the
figure. PCR oligonucleotide sequences are mentioned in the
text.

Abbreviations used:

AGα1 cds: coding sequence of α-agglutinin

α-AGG=AGα1: gene expressing α-agglutinin from S. cerevisiae

amp: β-lactamase resistance gene

lipolase: lipase gene of Humicola

α-MF: prepro-α-mating factor sequence

Pgal7=GAL7: GAL7 promoter

invSS: SUC2 signal sequence

α-gal: α-galactosidase gene

LEU2d: truncated promoter of LEU2 gene;

LEU2: LEU2 gene with complete promoter sequence.

FIG. 8: DNA sequence of a fragment containing the
complete coding sequence of lipase B of Geotrichum candidum
strain 335426 (see SEQ ID NO: 11). The sequence of
the mature lipase B starts at nucleotide 97 of the given
sequence. The coding sequence starts at nucleotide 40
(ATG).
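
The coordinates given in this legend can be checked against the first rows of the FIG. 8 listing. The following is a minimal sketch, not part of the patent; the helper names `parse_listing` and `codon_at` are hypothetical. Dropping every character that is not a base also removes position labels that scanning has split (for example "14 01").

```python
import re

def parse_listing(rows):
    """Join numbered 10-base blocks into one string, dropping position labels.

    Digits and spaces are discarded, so a label split by scanning
    (for example "14 01") does not disturb the result.
    """
    return "".join(re.sub(r"[^ACGTacgt]", "", row) for row in rows).upper()

def codon_at(seq, position):
    """Return the three bases starting at a 1-based nucleotide position."""
    return seq[position - 1:position + 2]

# First three rows of the FIG. 8 listing (nucleotides 1-120).
rows = [
    "1 AATTCGGCAC GAGATTCCTT TGATTTGCAA CTGTTAATCA",
    "41 TGGTTTCCAA AAGCTTTTTT TTGGCTGCGG CGCTCAACGT",
    "81 AGTGGGCACC TTGGCCCAGG CCCCCACGGC CGTTCTTAAT",
]
seq = parse_listing(rows)

start_codon = codon_at(seq, 40)          # legend: coding sequence starts here
first_mature_codon = codon_at(seq, 97)   # legend: mature lipase B starts here
leader_codons = (97 - 40) // 3           # codons preceding the mature protein

print(start_codon, first_mature_codon, leader_codons)
# ATG CAG 19
```

Nucleotides 40 and 97 are 57 bases apart, a whole number of codons, so the two positions stated in the legend lie in the same reading frame.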
FIG. 9: Schematic presentation of the construction of
pUR2975 and pUR2976. The restriction sites for endonucleases
used are shown. Abbreviations used:

α-AGG: gene expressing α-agglutinin from S. cerevisiae

amp: β-lactamase resistance gene

invSS: SUC2 signal sequence

LEU2d: truncated promoter LEU2 gene

Pgal7=GAL7: GAL7 promoter

α-MF: prepro-α-mating factor sequence

lipolase: lipase gene of Humicola

lipaseB: lipaseB gene of Geotrichum candidum.

FIG. 10: Schematic presentation of the construction of
pUR2981 and pUR2982. The restriction sites for endonucleases
used are shown. Abbreviations used:

α-AGG=AG-alpha 1: gene expressing α-agglutinin from
S. cerevisiae

mucor lipase: lipase gene of Rhizomucor miehei

Pgal7=GAL7: GAL7 promoter

α-MF: prepro-α-mating factor sequence

amp: β-lactamase resistance gene;

2u: 2 μm sequence

invSS: SUC2 signal sequence

lipolase: lipase gene of Humicola

LEU2d: truncated promoter LEU2 gene

LEU2: LEU2 gene with complete promoter sequence.

FIG. 11: DNA sequence (2685 bases) of the 894 amino
acids coding part of the FLO1 gene (see SEQ ID NO: 21).
The given sequence starts with the codon for the first amino
acid and ends with the stop codon.
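
The two numbers in this legend are mutually consistent: 894 codons plus one stop codon give 3 × 894 + 3 = 2685 bases. A short sketch of that check follows; it is illustrative only, the function names are hypothetical, and the toy sequence joins just the first three and last two codons of the FIG. 11 listing rather than the full reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def expected_orf_length(amino_acids):
    """Nucleotides in an open reading frame: three per amino acid plus a stop codon."""
    return 3 * amino_acids + 3

def looks_like_orf(seq):
    """True if seq is a whole number of codons, starts with ATG and ends with a stop."""
    seq = seq.upper()
    return len(seq) % 3 == 0 and seq.startswith("ATG") and seq[-3:] in STOP_CODONS

print(expected_orf_length(894))
# 2685

# Toy check only: first three codons (ATG ACA ATG) and last two (GTT TAA) of the listing.
toy = "ATGACAATG" + "GTTTAA"
print(looks_like_orf(toy))
# True
```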

FIG. 12: Schematic presentation of plasmid pUR2990. 
Some restriction sites for endonucleases relevant for the 
given cloning procedure are shown. 

FIG. 13: Schematic presentation of plasmid pUR7034. 

FIG. 14: Schematic presentation of plasmid pUR2972B. 

FIG. 15: Immunofluorescent labelling (anti-lipolase) of
SU10 cells in the exponential phase (OD530=0.5) expressing
the lipolase/α-agglutinin fusion protein.

A: phase micrograph B: matching fluorescent micrograph

DETAILED DESCRIPTION OF THE 
INVENTION 

The present invention provides a method for immobiliz- 
ing an enzyme, comprising immobilizing the enzyme or a 
functional part thereof to the cell wall of a host cell, 
preferably a microbial cell, using recombinant DNA tech- 
niques. In particular, the C-terminal part of a protein that 
ensures anchoring in the cell wall is linked to an enzyme or 
the functional part of an enzyme, in such a way that the 
enzyme is localized on or just above the cell surface. In this 
way immobilized enzymes are obtained on the surface of 
cells. The linkage is performed at gene level and is charac- 
terized in that the structural gene coding for the enzyme is 
coupled to at least part of a gene encoding an anchor-protein 
in such a way that in the expression product the enzyme is 
coupled at its C-terminal end to the C-terminal part of an 
anchor-protein. The chimeric enzyme is preferably preceded 
by a signal sequence that ensures efficient secretion of the 
chimeric protein. 

Thus the invention relates to a recombinant polynucle- 
otide comprising a structural gene encoding a protein pro- 
viding catalytic activity and at least a part of a gene encoding 
a protein capable of anchoring in a eukaryotic or prokaryotic 
cell wall, said part encoding at least the C-terminal part of
said anchoring protein. The length of the C-terminal part of
the anchoring protein may vary. Although the entire structural
protein could be used, it is preferred that only a part is
used, leading to a more efficient exposure of the enzyme
protein in the medium surrounding the cell. The anchoring
part of the anchoring protein should preferably be entirely
present. As an example, about the C-terminal half of the
anchoring protein could be used. Preferably, the polynucleotide
further comprises a sequence encoding a signal peptide
ensuring secretion of the expression product of the
polynucleotide. The signal peptide can be derived from a GPI
anchoring protein, a-factor, a-agglutinin, invertase or
inulinase, a-amylase of Bacillus, or a proteinase of lactic
acid bacteria. The protein capable of anchoring in the cell
wall is preferably selected from the group of a-agglutinin,
AGA1, FLO1 (flocculation protein) or the Major Cell Wall
Protein of lower eukaryotes, or a proteinase of lactic acid
bacteria. The polynucleotide of the invention is preferably
operably linked to a promoter, preferably a regulatable
promoter, especially an inducible promoter.

The invention also relates to a recombinant vector con- 
taining the polynucleotide as described above, and to a host 
cell containing this polynucleotide, or this vector. In a 
particular case, wherein the protein providing catalytic 
activity exhibits said catalytic activity when present in a
multimeric form, such as may be the case with 
oxidoreductases, dimerisation or multimerisation of the 
monomers might be a prerequisite for activity. The vector 
and/or the host cell can then further comprise a second 
polynucleotide comprising a structural gene encoding the 
same protein providing catalytic activity combined with a 
sequence encoding a signal peptide ensuring secretion of the 
expression product of said second polynucleotide, said sec- 
ond polynucleotide being operably linked to a regulatable 
promoter, preferably an inducible or repressible promoter. 
Expression and secretion of the second polynucleotide after 
expression and secretion of the first polynucleotide will then 
result in the formation of an active multimer on the exterior
of the cell wall. The host cell or microorganism preferably 
contains the polynucleotide described above, or at least one 
of said polynucleotides in the case of a combination, inte- 
grated in its chromosome. 

The present invention relates in particular to lower 
eukaryotes like yeasts that have very stable cell walls and 
have proteins that are known to be anchored in the cell wall, 
e.g. a-agglutinin or the product of gene FLO1. Suitable
yeasts belong to the genera Candida, Debaryomyces, 
Hansenula, Kluyveromyces, Pichia and Saccharomyces. 
Also fungi, especially Aspergillus, Penicillium and Rhizo- 
pus can be used. For certain applications also prokaryotes 
are applicable. 

For yeasts the present invention deals in particular with 
genes encoding chimeric enzymes consisting of: 

a. the signal sequence e.g. derived from the a-factor-, the 
invertase-, the a-agglutinin- or the inulinase genes; 

b. structural genes encoding hydrolytic enzymes such as 
a-galactosidase, proteases, peptidases, pectinases, 
pectylesterase, rhamnogalacturonase, esterases and 
lipases, or non-hydrolytic enzymes such as oxidases;
and

c. the C-terminus of typically cell wall bound proteins
such as a-agglutinin (see reference 4), AGA1 (see
reference 5) and FLO1 (see the non-prior published
reference 6).
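
The three-part layout listed above (signal sequence, structural enzyme gene, C-terminal anchor) can be written down as a small data structure. The following is a minimal Python sketch for illustration only: the names `Part`, `assemble_chimeric_gene` and `describe` are hypothetical, and the strings merely label the parts; they are not sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    role: str    # "signal", "enzyme" or "anchor"
    source: str  # free-text label of the gene the part is taken from


# N- to C-terminal order of the parts in the chimeric gene product.
EXPECTED_ORDER = ("signal", "enzyme", "anchor")


def assemble_chimeric_gene(signal: Part, enzyme: Part, anchor: Part) -> tuple:
    """Return the parts in N- to C-terminal order after checking their roles."""
    parts = (signal, enzyme, anchor)
    roles = tuple(p.role for p in parts)
    if roles != EXPECTED_ORDER:
        raise ValueError(f"expected roles {EXPECTED_ORDER}, got {roles}")
    return parts


def describe(parts) -> str:
    """One-line summary of a construct."""
    return " - ".join(f"{p.role}[{p.source}]" for p in parts)


# Labels taken from the options named in items a to c.
example = assemble_chimeric_gene(
    Part("signal", "SUC2 invertase"),
    Part("enzyme", "a-galactosidase"),
    Part("anchor", "a-agglutinin C-terminal part"),
)
print(describe(example))
```

The order check reflects the requirement that the enzyme is coupled at its C-terminal end to the anchoring part, with the signal sequence preceding both.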

The expression of these genes can be under the control of 
a constitutive promoter, but more preferred are regulatable,
i.e. repressible or inducible promoters such as the GAL7
promoter for Saccharomyces, or the inulinase promoter for 
Kluyveromyces or the methanol-oxidase promoter for 
Hansenula. 

Preferably the constructs are made in such a way that the 
new genetic information is integrated in a stable way in the 
chromosome of the host cell. 

The invention further relates to a host cell, in particular a 
microorganism, having the chimeric protein described above 
immobilized on its cell wall. It further concerns the use of 
such microorganisms for carrying out an enzymatic process 
by contacting a substrate for the enzyme with the microor- 
ganism. Such a process may be carried out e.g. in a packed 
column, wherein the microorganisms may be supported on 
solid particles, or in a stirred reactor. The reaction may be 
aqueous or non-aqueous. Where necessary, additives neces- 
sary for the performance of the enzyme, e.g. a co-factor, may 
be introduced in the reaction medium. 

After repeated usage of the naturally immobilized enzyme 
system in processes, the performance of the system may 
decrease. This is caused either by physical denaturation or 
by chemical poisoning or detachment of the enzyme. A 
particular feature of the present invention is that after usage 
the system can be recovered from the reaction medium by 
simple centrifugation or membrane filtration techniques and 
that the thus collected cells can be transferred to a recovery 
medium in which the cells revive quickly and concomitantly 
produce the chimeric protein, thus ensuring that the surface 
of the cells will be covered by fully active immobilized 
enzyme. This regeneration process is simple and cheap and 
therefore will improve the economics of enzymic processes 
and may result in a much wider application of processes 
based on immobilized enzyme systems. 

However, the present invention is by no means restricted
to the reusability of the immobilized enzymes.

The invention will be illustrated by the following 
examples without the scope of the invention being limited 
thereto. 

EXAMPLE 1 

Immobilized a-Galactosidase/a-Agglutinin on the
Surface of S. cerevisiae.

The gene encoding a-agglutinin has been described by
Lipke et al. (see reference 4). The sequence of a 6057 bp
HindIII insert in pTZ18R, containing the whole AGa1 gene,
is given in FIG. 1. The coding sequence spans 650
amino acids, including a putative signal sequence starting at
nucleotide 3653 with ATG. The unique NheI site cuts the
DNA at position 988 of the given coding sequence within the
coding part of amino acid 330, thereby separating the
a-agglutinin into an N-terminal and a C-terminal part of
about the same size.
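
The coordinates quoted in this paragraph can be cross-checked with simple integer arithmetic. The Python sketch below is an illustration, not part of the described method. It assumes 1-based numbering for both the insert and the coding sequence (the text does not state the convention), and the function name `codon_index` is hypothetical.

```python
# Illustrative arithmetic only; assumes 1-based nucleotide numbering throughout.


def codon_index(cds_pos: int) -> int:
    """Return the 1-based codon number containing a 1-based coding-sequence position."""
    if cds_pos < 1:
        raise ValueError("coding-sequence positions start at 1")
    return (cds_pos - 1) // 3 + 1


ORF_START_IN_INSERT = 3653  # position of the A of the ATG, from the text
NHEI_POS_IN_CDS = 988       # NheI position within the coding sequence, from the text
CODING_AMINO_ACIDS = 650    # length of the coding sequence in amino acids
INSERT_LENGTH_BP = 6057     # length of the HindIII insert

split_codon = codon_index(NHEI_POS_IN_CDS)
nhei_pos_in_insert = ORF_START_IN_INSERT + NHEI_POS_IN_CDS - 1
codons_before_split = split_codon - 1
codons_from_split = CODING_AMINO_ACIDS - codons_before_split
bp_downstream_in_insert = INSERT_LENGTH_BP - nhei_pos_in_insert

print("codon containing the NheI position:", split_codon)
print("NheI position in insert coordinates:", nhei_pos_in_insert)
print("complete codons upstream:", codons_before_split)
print("codons from the split codon onward:", codons_from_split)
print("insert length downstream of the site (bp):", bp_downstream_in_insert)
```

Under these assumptions position 988 falls in codon 330, which leaves 329 complete codons upstream and 321 from the split codon onward, in line with the description of two parts of about the same size. The insert length remaining downstream of the site comes to roughly 1.4 kb, which fits the size quoted for the NheI/HindIII fragment.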

Through digestion of pUR2968 (see FIG. 2) with NheI/
HindIII a 1.4 kb fragment was released, containing the
sequence information of the putative cell wall anchor. For
the fusion to a-galactosidase the plasmid pSY16 was used,
an episomal vector based on YEplac181, harboring the
a-galactosidase sequence preceded by the SUC2 invertase
signal sequence and placed between the constitutive PGK
promoter and PGK terminator. The StyI site, present in the
last nine base-pairs of the open reading frame of the
a-galactosidase gene, was ligated to the NheI site of the
AGa1 gene fragment. To ensure the in-frame fusion, the StyI
site was filled in and the 5' overhang of the NheI site was
removed, prior to ligation into the StyI/HindIII digested
pSY13 (see FIG. 2).
To verify the correct assembly of the new plasmid, the
shuttle vector was transformed into E. coli JM109 (recA1
supE44 endA1 hsdR17 gyrA96 relA1 thi Δ(lac-proAB) F'
[traD36 proAB+ lacIq lacZΔM15]) (see reference 7) by the
transformation protocol described by Chung et al. (see
reference 8). One of the positive clones, designated
pUR2969, was further characterized: the DNA was isolated and
purified according to the Qiagen protocol and subsequently
characterized by DNA sequencing. DNA sequencing was
mainly performed as described by Sanger et al. (see reference 9)
and Hsiao (see reference 10), here with the Sequenase
version 2.0 kit from United States Biochemical
Company, according to the protocol with T7 DNA polymerase
(Amersham International plc) and [35S]dATPαS
(Amersham International plc: 370 MBq/ml; 22 TBq/mmol).

This plasmid was then transformed into S. cerevisiae 
strain MT302/1C according to the protocol from Klebe et al. 
(see reference 11). 

Yeast transformants were selected on selective plates
lacking leucine, on which 40 μl X-a-Gal (20 mg/ml in DMF;
5-bromo-4-chloro-3-indolyl-a-D-galactoside, Boehringer
Mannheim) was spread, to directly test for a-galactosidase
activity (see reference 12). To demonstrate the expression,
secretion, localization and activity of the chimeric protein
the following analyses were performed:
1. Expression and Secretion 

S. cerevisiae strain MT302/1C was transformed with
either plasmid pSY13 containing the a-galactosidase gene
of Cyamopsis tetragonoloba or plasmid pUR2969 containing
the a-galactosidase/a-agglutinin fusion construct. During
batch culture a-galactosidase activities were determined
for washed cells and growth medium. The results are given
in FIG. 3 and FIG. 4. The a-galactosidase expressed from
yeast cells containing plasmid pSY13 was almost exclusively
present in the growth medium (FIG. 3A), whereas the
a-galactosidase-a-agglutinin fusion protein was almost
exclusively cell associated (FIG. 4A). Moreover, the
immobilized, cell wall-associated a-galactosidase-a-agglutinin
fusion enzyme retained its complete activity
over the whole incubation time, while the secreted and
released enzyme lost about 90% of its activity after an
incubation of 65 hours. This indicates that the immobilization
of the described enzyme in the cell wall of yeast
protects the enzyme against inactivation, presumably
by proteinases, and thereby increases the stability
significantly. Further insight into the localization of the
different gene products was obtained by Western analysis.
For this purpose, cells were harvested by centrifugation and
washed in 10 mM Tris.HCl, pH 7.8; 1 mM PMSF at 0° C.,
and all subsequent steps were performed at the same
temperature. Three ml isolation buffer and 10 g of glass beads
were added per gram of cells (wet weight). The mixture was
shaken in a Griffin shaker at 50% of its maximum speed for
30 minutes. The supernatant was isolated and the glass beads
were washed with 1M NaCl and 1 mM PMSF until the
washes were clear. The supernatant and the washes were
pooled. The cell walls were recovered by centrifugation and
were subsequently washed in 1 mM PMSF.

Non-covalently bound proteins or proteins bound through
disulphide bridges were released from cell walls by boiling
for 5 minutes in 50 mM Tris.HCl, pH 7.8, containing 2%
SDS, 100 mM EDTA and 40 mM β-mercaptoethanol. The
SDS-extracted cell walls were washed several times in 1
mM PMSF to remove SDS. Ten mg of cell walls (wet
weight) were taken up in 20 μl 100 mM sodium acetate, pH
5.0, containing 1 mM PMSF. To this, 0.5 mU of
β-1,3-glucanase (Laminarinase; Sigma L5144 was used as a
source of β-1,3-glucanase) was added followed by incubation
for 2 hours at 37° C. Subsequently another 0.5 mU of
β-1,3-glucanase was added, followed by incubation for
another 2 hours at 37° C.

Proteins were denatured by boiling for 5 minutes preced- 
ing Endo-H treatment. Two mg of protein were incubated in 
1 ml 50 mM potassium phosphate, pH 5.5, containing 100 
mM β-mercaptoethanol and 0.5 mM PMSF with 40 mU
Endo-H (Boehringer) for 48 hours at 37° C. Subsequently 20 
mU Endo-H were added followed by 24 hours of incubation 
at 37° C. 

Proteins were separated by SDS-PAGE according to
Laemmli (see reference 13) in 2.2-20% gradient gels. The
gels were blotted by electrophoretic transfer onto Immobilon
polyvinylidene-difluoride membrane (Millipore) as
described by Towbin et al. (see reference 14). In case of
highly glycosylated proteins a subsequent mild periodate
treatment was performed in 50 mM periodic acid, 100 mM
sodium acetate, pH 4.5, for several hours at 4° C. All
subsequent incubations were carried out at room temperature.
The blot was blocked in PBS, containing 0.5% gelatine
and 0.5% Tween-20, for one hour followed by incubation for
1 hour in probe buffer (PBS, 0.2% gelatine, 0.1% Tween-20)
containing 1:200 diluted serum. The blot was subsequently
washed several times in washing buffer (PBS; 0.2% gelatine;
0.5% Tween-20) followed by incubation for 1 hour in
probe buffer containing 125I-labelled protein A (Amersham).
After several washes in washing buffer, the blot was air-dried,
wrapped in Saran (Dow) and exposed to X-omat S
film (Kodak) with intensifying screen at -70° C. An Omnimedia
6cx scanner and the Adobe Photoshop programme
were used to quantify the amount of labelled protein. The
results of the various protein isolation procedures from both
transformants are given in FIG. 5. For the transformants
comprising the pSY13 plasmid the bulk of the
enzyme was localized in the medium, with only minor
amounts of enzyme more entrapped than bound in the cell
wall (FIG. 5A), which could be removed completely by
SDS extraction. The fusion protein, by contrast, was tightly
bound to the cell wall, with only small amounts of
a-galactosidase/a-agglutinin delivered into the surrounding
culture fluid or being SDS extractable. Laminarinase
extraction of cell walls from cells expressing the free
a-galactosidase liberated no further enzyme, whereas
identical treatment of fusion enzyme expressing cells
released the bulk of the enzyme.
This indicates that the fusion protein is intimately associated
with the cell wall glucan in S. cerevisiae, like a-agglutinin,
while a-galactosidase alone is not. The subsequently performed
Endo-H treatment showed a heavy glycosylation of
the fusion protein, a result entirely in agreement with the
described extended glycosylation of the C-terminal part of
a-agglutinin.
2. Localization 

Immunofluorescent labelling with anti-a-galactosidase
serum was performed on intact cells to determine the
presence and distribution of a-galactosidase/a-agglutinin
fusion protein in the cell wall. Immunofluorescent labelling
was carried out without fixing according to Watzele et al.
(see reference 15). Cells of OD530=2 were isolated and
washed in TBS (10 mM Tris.HCl, pH 7.8, containing 140
mM NaCl, 5 mM EDTA and 20 μg/ml cycloheximide). The
cells were incubated in TBS + anti-a-galactosidase serum for
1 hour, followed by several washings in TBS. A subsequent
incubation was carried out with FITC-conjugated anti-rabbit
IgG (Sigma) for 30 minutes. After washing in TBS, cells
were taken up in 10 mM Tris.HCl, pH 9.0, containing 1
mg/ml p-phenylenediamine and 0.1% azide and were
photographed on a Zeiss 68000 microscope. The results of these
analyses are given in FIG. 6, showing clearly that the
chimeric a-galactosidase/a-agglutinin is localized at the
surface of the yeast cell. Buds of various sizes, even very
small ones, are uniformly labelled, which demonstrates that the
fusion enzyme is continuously incorporated into the cell
wall throughout the cell cycle and that it instantly becomes
tightly linked.

3. Activity 

To quantitatively assay a-galactosidase activity, 200 μl
samples containing 0.1M sodium acetate, pH 4.5 and 10
mM p-nitrophenyl-a-D-galactopyranoside (Sigma) were
incubated at 37° C. for exactly 5 minutes. The reaction was
stopped by addition of 1 ml 2% sodium carbonate. From
intact cells and cell walls, removed by centrifugation and
isolated and washed as described, the a-galactosidase activity
was calculated using the extinction coefficient of
p-nitrophenol of 18.4 cm²/μmole at 410 nm. One unit was
defined as the hydrolysis of 1 μmole substrate per minute at
37° C.

TABLE 1

Distribution of free and immobilized a-galactosidase activity in
yeast cells

                             a-Galactosidase activity
Expressed protein            Growth medium   Intact cells   Isolated cell walls
a-galactosidase              14.7            0.37           0.01
aGal/aAGG fusion protein     0.54            13.3           10.9

Transformed MT302/1C cells were in exponential phase (OD530=2). One
unit is defined as the hydrolysis of 1 μmole of p-nitrophenyl-a-D-
galactopyranoside per minute at 37° C.

The results are summarized in Table 1. While the overall
majority of a-galactosidase was distributed in the culture
fluid, most of the fusion product was associated with the
cells, primarily with the cell wall. Taking together the results
shown in FIGS. 3 to 6 and in Table 1, it could be calculated
that the enzymatic a-galactosidase activity of the chimeric
enzyme is as good as that of the free enzyme. Moreover,
during stationary phase, the activity of the a-galactosidase
in the growth medium decreased, whereas the activity of the
cell wall associated a-galactosidase-a-agglutinin fusion
remained constant, indicating that the cell associated fusion
protein is protected from inactivation or proteolytic
degradation.

N.B. The essence of this EXAMPLE was published
during the priority year by M. P. Schreuder et al. (see
reference 25).

EXAMPLE 2A

Immobilized Humicola Lipase/a-Agglutinin on the
Surface of S. cerevisiae (inducible expression of
immobilized enzyme system)

The construction and isolation of the 1.4 kb NheI/HindIII
fragment containing the C-terminal part of a-agglutinin has
been described in EXAMPLE 1. Plasmid pUR7021 contains
an 894 bp long synthetically produced DNA fragment
encoding the lipase of Humicola (see reference 16 and SEQ
ID NO: 7 and 8), cloned into the EcoRI/HindIII restriction
sites of the commercially available vector pTZ18R (see FIG.
7). For the proper one-step modification of both the 5' end
and the 3' end of the DNA part coding for the mature lipase,
the PCR technique can be applied. Therefore the DNA
oligonucleotides lipo1 (see SEQ ID NO: 3) and lipo2 (see
SEQ ID NO: 6) can be used as primers in a standard PCR
protocol, generating an 826 bp long DNA fragment with an
EagI and a HindIII restriction site at the ends, which can be
combined with the larger part of the EagI/HindIII digested
pUR2650, a plasmid containing the a-galactosidase gene
preceded by the invertase signal sequence as described
earlier in this specification, thereby generating plasmid
pUR2970A (see FIG. 7).

PCR oligonucleotides for the in-frame linkage of Humicola
lipase and the C-terminus of a-agglutinin.

a: PCR oligonucleotides for the transition between SUC2
signal sequence and the N-terminus of lipase.

                             >mature lipase
                      EagI    E   V   S   Q   D   L   F
primer lipo1: 5'-GOG GCG GCC GAG GTC TCG CAA GAT CTG GA-3'
                             ||| ||| ||| ||| ||| ||| ||
lipase:       3'-TAA GCA GCT CTC CAG AGC GTT CTG GAC CTG TTT-5'
(non-coding strand, see SEQ ID NO: 4)

b: PCR oligonucleotides for the in frame transition
between C-terminus of lipase and C-terminal part of
a-agglutinin.

                  F   G   L   I   G   T   C   L
lipase        5'-TTC GGG TTA ATT GGG ACA TGT CTT TAG TGC GA-3'
(cod. strand)        ||| ||| ||| ||| ||| ||| ||| ||| ||| ||
primer lipo2:     3'-CCC AAT TAA CCC TGT ACA GAA CGA TCG GAA TTC GAACCCC-5'
                                                  NheI       HindIII
(for the part of the lipase coding strand see SEQ ID NO: 5)

Through the PCR method a NheI site will be created at the
end of the coding sequence of the lipase, allowing the
in-frame linkage between the DNA coding for lipase and the
DNA coding for the C-terminal part of a-agglutinin. Plasmid
pUR2970A can then be digested with NheI and HindIII
and the 1.4 kb NheI/HindIII fragment containing the
C-terminal part of a-agglutinin from plasmid pUR2968 can
be combined with the larger part of NheI and HindIII treated
plasmid pUR2970A, resulting in plasmid pUR2971A. From
this plasmid the 2.2 kb EagI/HindIII fragment can be isolated
and ligated into the EagI- and HindIII-treated
pUR2741, whereby plasmid pUR2741 is a derivative of
pUR2740 (see reference 17), where the second EagI restriction
site in the already inactive Tet resistance gene was
deleted through NruI/SalI digestion. The SalI site was filled
in prior to religation. The ligation then results in pUR2972A
containing the GAL7 promoter, the invertase signal
sequence, the chimeric lipase/a-agglutinin gene, the 2 μm
sequence, the defective Leu2 promoter and the Leu2 gene.
This plasmid can be used for transforming S. cerevisiae and
the transformed cells can be cultivated in YP medium
containing galactose as an inducer without repressing
amounts of glucose being present, which causes the expression
of the chimeric lipase/a-agglutinin gene.

The expression, secretion, localization and activity of the
chimeric lipase/a-agglutinin can be analyzed using similar
procedures as given in EXAMPLE 1.

In a similar way variants of Humicola lipase, obtained via
rDNA techniques, can be linked to the C-terminal part of
a-agglutinin, which variants can have a higher stability
during (inter)esterification processes.

EXAMPLE 2B

Immobilized Humicola Lipase/a-Agglutinin on the
Surface of S. cerevisiae (inducible expression of
immobilized enzyme system)

EXAMPLE 2A describes a protocol for preparing a
particular construct. Before carrying out the work it was
considered more convenient to use the expression vector
described in EXAMPLE 1, so that the construction route
given in this EXAMPLE 2B differs on minor points from the
construction route given in EXAMPLE 2A and the resulting
plasmids are not identical to those described in EXAMPLE
2A. However, the essential gene construct comprising the
promoter, signal sequence, and the structural gene encoding
the fusion protein is the same in EXAMPLES 2A and 2B.

1. Construction

The construction and isolation of the 1.4 kb NheI/HindIII
fragment encoding the C-terminal part of a-agglutinin cell
wall protein has been described in EXAMPLE 1. The
plasmid pUR7033 (resembling pUR7021 of EXAMPLE
2A) was made by treating the commercially available vector
pTZ18R with EcoRI and HindIII and ligating the resulting
vector fragment with an 894 bp long synthetically produced
DNA EcoRI/HindIII fragment encoding the lipase of Humicola
(see SEQ ID NO: 7 and 8, and reference 16).

For the fusion of the lipase to the C-terminal cell wall
anchor-comprising domain of a-agglutinin, plasmid
pUR7033 was digested with EagI and HindIII, and the lipase
coding sequence was isolated and ligated into the EagI- and
HindIII-digested yeast expression vector pSY1 (see reference 27),
thereby generating pUR7034 (see FIG. 13). This is
a 2 μm episomal expression vector, containing the
a-galactosidase gene described in EXAMPLE 1, preceded
by the invertase (SUC2) signal sequence under the control of
the inducible GAL7 promoter.

Parallel to this digestion, pUR7033 was also digested with
EcoRV and HindIII, thereby releasing a 57 bp long DNA
fragment, possessing codons for the last 15 carboxyterminal
amino acids. This fragment was exchanged against a small
DNA fragment, generated through the hybridisation of the
two chemically synthesized deoxyoligonucleotides SEQ ID
NO: 9 and SEQ ID NO: 10. After annealing of both DNA
strands, these two oligonucleotides essentially reconstruct
the rest of the 3' coding sequence of the initial lipase gene,
but additionally introduce downstream of the lipase gene a
new NheI restriction site, followed by a HindIII site in close
[...] nucleotides of the NheI site
form the [...] for the last [...] of lipase. The
resulting plasmid was designated pUR2970B. Subsequently,
this construction intermediate was digested with EagI and
NheI, the lipase encoding fragment was isolated, and,
together with the 1.4 kb NheI/HindIII fragment of pUR2968,
ligated into the EagI- and HindIII-cut pSY1 vector. The
outcome of this 3-point-ligation was called pUR2972B (see
FIG. 14), the final lipolase-a-agglutinin yeast expression
vector.

This plasmid was used for transforming S. cerevisiae
strain SU10 as described in reference 17 and the transformed
cells were cultivated in YP medium containing galactose as
the inducer without repressing amounts of glucose being
present, which causes the expression of the chimeric lipase/
a-agglutinin gene.

2. Activity

To quantify the lipase activity, two activity measurements
with two separate substrates were performed. In both cases,
SU10 yeast cells transformed with either plasmid pUR7034
or pSY1 served as control. For this purpose, yeast cell
transformants containing either plasmid pSY1 or plasmid pUR7034
or plasmid pUR2972B were grown up for 24 h in YNB-
glucose medium supplied with histidine and uracil, then
diluted 1:10 in YP-medium supplied with 5% galactose, and
again cultured. After 24 h incubation at 30° C., a first
measurement for both assays was performed.

The first assay applied was the pH stat method. Within
this assay, one unit of lipase activity is defined as the amount
of enzyme capable of liberating one micromole of fatty acid
per minute from a triglyceride substrate under standard
conditions (30 ml assay solution containing 38 mM
olive oil, considered as pure trioleate, emulsified with 1:1
w/w gum arabic, 20 mM calcium chloride, 40 mM sodium
chloride, 5 mM Tris, pH 9.0, 30° C.) in a radiometer pH stat
apparatus (pHM 84 pH meter, ABU 80 autoburette, TTA 60
titration assembly). The fatty acids formed were titrated with
0.05N NaOH and the activity measured was based on alkali
consumption in the interval between 1 and 2 minutes after
addition of putative enzyme batch. To test for immobilized
lipase activity, 1 ml of each culture was centrifuged, the
supernatant was saved, the pellet was resuspended and
washed in 1 ml 1M sorbitol, subsequently again centrifuged
and resuspended in 200 μl 1M sorbitol. From each type of
yeast cell the first supernatant and the washed cells were
tested for lipase activity.

A: Lipase activity after 24 h (LU/ml)

24.1

632.0

pUR2972B-(1)

18.7

B: Lipase activity after 48 h (LU/ml)

The rest of the yeast cultures was further incubated, and
essentially the same separation procedure was done after 48
hours. Dependent on the initial activity measured, the actual
volume of the sample measured varied between 25 μl and
150 μl.

This series of measurements indicates that yeast cells
comprising the plasmid coding for the lipase-a-agglutinin
fusion protein in fact express some lipase activity which is
associated with the yeast cell.
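
The unit definition of the pH stat assay (one micromole of fatty acid liberated per minute, titrated with 0.05N NaOH) translates into a short calculation. The Python sketch below is only an illustration under stated assumptions: one mole of NaOH is taken to neutralise one mole of released fatty acid, the function name `lipase_units_per_ml` is hypothetical, and the titrant consumption used in the example is an invented number, not a measured value.

```python
def lipase_units_per_ml(titrant_ml_per_min: float,
                        sample_ul: float,
                        naoh_normality: float = 0.05) -> float:
    """Convert alkali consumption into lipase units per ml of sample.

    Assumes 1 mol NaOH titrates 1 mol fatty acid, so that
    ml/min * (mmol/ml) * 1000 gives micromoles of fatty acid per minute.
    """
    if titrant_ml_per_min < 0 or sample_ul <= 0 or naoh_normality <= 0:
        raise ValueError("rates must be non-negative; volume and normality positive")
    umol_fatty_acid_per_min = titrant_ml_per_min * naoh_normality * 1000.0
    return umol_fatty_acid_per_min / (sample_ul / 1000.0)


# Hypothetical alkali consumption between minute 1 and minute 2 of the assay.
hypothetical_rate_ml_per_min = 0.010

# The text states that sample volumes ranged from 25 to 150 microlitres.
for sample_ul in (25.0, 150.0):
    lu = lipase_units_per_ml(hypothetical_rate_ml_per_min, sample_ul)
    print(f"{sample_ul:.0f} ul sample: {lu:.2f} LU/ml")
```

Dividing by the sample volume is what makes results comparable when, as described, different volumes were assayed depending on the initial activity.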

A second assay was performed to further
confirm the immobilization of lipase activity on the yeast
cell surface. Briefly, within this assay, the kinetics of the
PNP (=paranitrophenyl) release from PNP-butyrate is
determined by measurement of the OD at 400 nm. For this purpose, 10
ml cultures containing yeast cells with either pSY1,
pUR7034 or pUR2972B were centrifuged, the pellet was
resuspended in 4 ml of buffer A (0.1M NaOAc, pH 5.0 and
1 mM PMSF), from this 4 ml 500 μl was centrifuged again
and resuspended in 500 μl PNB-buffer (20 mM Tris-HCl, pH
9.0, 20 mM CaCl2, 25 mM NaCl), centrifuged once again,
and finally resuspended in 400 μl PNB buffer. This fraction
was used to determine the cell bound fraction of lipase.

The remaining 3500 μl were spun down, the pellet was
resuspended in 4 ml buffer A, and to each sample 40 μl
laminarinase (ex mollusc, 1.25 mU/μl) was added, followed by
incubation for 3 hours at 37° C. and then overnight at 20° C.
Then the reaction mixture, still containing intact cells,
was centrifuged again and the supernatant was used to
determine the amount of originally cell wall bound material
released through laminarinase incubation. The final pellet
was resuspended in 400 μl PNP buffer, to calculate the still
cell associated part. The blank reaction of a defined volume
of specific culture fraction in 4 ml assay buffer was
determined, and then the reaction was started through
addition of 80 μl of substrate solution (100 mM PNP-butyrate in
methanol), and the reaction was observed at 25° C. at 400
nm in a spectrophotometer.
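
A kinetic read-out of this kind is typically reduced to a rate of OD change per minute. The Python sketch below shows one way to do that, as an illustration only: it fits a least-squares slope to the OD400 readings and subtracts the slope of the blank. Treating the blank as a baseline drift to be subtracted is an assumption (the text only says that the blank reaction was determined), the function names are hypothetical, and the readings are invented numbers.

```python
def slope(times, values):
    """Least-squares slope of values against times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired readings")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def net_rate(times_min, od_reaction, od_blank):
    """OD400 increase per minute of the reaction minus that of the blank."""
    return slope(times_min, od_reaction) - slope(times_min, od_blank)


# Invented example readings (minutes, OD at 400 nm).
times_min = [0.0, 1.0, 2.0, 3.0]
od_with_substrate = [0.10, 0.15, 0.20, 0.25]
od_blank = [0.10, 0.11, 0.12, 0.13]

rate = net_rate(times_min, od_with_substrate, od_blank)
print(f"net rate: {rate:.3f} OD400 units per minute")
```

Fitting a slope over several readings, rather than taking a two-point difference, makes the estimate less sensitive to noise in any single measurement.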
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wall was also confirmed through Immunofluorescent label- 
ling with anti-lipolase serum essentially as described in 
EXAMPLE 1, item 2. Localization. 

As can be seen in FIG. 15, the Immunofluorescent stain 
shows essentially an analogous picture as the 
a-galactosidase immuno stain, with clearly detectable reac- 
tivity on the outside of the cell surface (see FIG. 15 A 
showing a clear halo around the cells and FIG. B showing 
a lighter circle at the surface of the cells), but neither in the 
medium nor in the interior of the cells. Yeast cells expressing 
pUR2972B, the Humicola lipase -a-agglutinin fusion 
protein, become homogeneously stained on the surface, 
indicating the virtually entire irmnobiLization of a chimeric 
enzyme with an a-agglutinin C-terminus on the exterior of 
a yeast cell. In the performed control experiment SU10 yeast 
cells containing plasmid pUR7034 served as a control and 
here, no cell surface bound reactivity against the applied 
anti-lipase serum could be detected. 

In a similar way variants of Humicola lipase, obtained via 
rDNA techniques, can be linked to the C-terminal part of 
a-agglutinin, which variants can have a higher stability 
during (inter)esterification processes. 

EXAMPLE 3 

Immobilized Humicola Lipase/a-Agglutinin on the 
Surface of 5. cerevisiae (constitutive expression of 
immobilized enzyme system) 

Plasmid pUR2972 as described in EXAMPLE 2 can be 
treated with EagI and Hindlll and the about 2.2 kb fragment 
containing the lipase/a-agglutinin gene can be isolated. 
Plasmid pSY16 can be restricted with EagI and Hindlll and 
between these sites the 2.2 kb fragment containing the 
lipase/a-agglutinin fragment can be ligated resulting in 
pUR2973. The part of this plasmid that is involved in the 
production of the chimeric enzyme is similar to pUR2972 
with the exception of the signal sequence. Whereas 
pUR2972 contains the SUC2-invertase-signal sequence, 
pUR2973 contains the a-mating factor signal sequence (see 
reference 18). Moreover the plasmid pUR2973 contains the 
Leu2 marker gene with the complete promoter sequence, 
instead of the truncated promoter version of pUR2972. 

EXAMPLE 4 

Immobilized Geotrichum Lipase/α-Agglutinin on 
the Surface of S. cerevisiae 

The construction and isolation of the 1.4 kb NheI/HindIII 
fragment comprising the C-terminal part of AGα-1 





               cell bound       activity in   laminarinase   laminarinase
               activity *       the medium    extract        extracted cells   OD660

pSY1           0.001 (116 µl)   0.001         0.028          0.000             26
pUR7034        0.293 (220 µl)   0.446         0.076          0.585             2.36
pUR2972B-(1)   0.494 (143 µl)   0.021         0.170          0.208             2.10

* unless otherwise mentioned, the volume of enzyme solution added was 20 µl 



This result demonstrates that a significant 
amount of lipase activity is immobilized on the surface of yeast 
cells containing plasmid pUR2972B. Here again, incorporation 
took place in such a way that the reaction was 
catalyzed by cell-wall-inserted lipase of intact cells, 
indicating an exterior-orientated immobilization. 
Furthermore, the release of a significant amount of lipase 
activity after incubation with laminarinase again 
demonstrates the presumably covalent incorporation of a 
heterologous enzyme through gene fusion with the C-terminal part 
of α-agglutinin. 

3. Localization 

The expression, secretion, and subsequent incorporation 
of the lipase-α-agglutinin fusion protein into the yeast cell 



(α-agglutinin) gene has been described in EXAMPLE 1. For 
the in-frame gene fusion of the DNA coding for the 
C-terminal membrane anchor of α-agglutinin to the complete 
coding sequence of Geotrichum candidum lipase B 
from strain CMICC 335426 (see FIG. 8 and SEQ ID NO: 11 
and 12), the plasmid pUR2974 can be used. This plasmid, 
derived from the commercially available pBluescript II SK 
plasmid, contains the cDNA coding for the complete G. 
candidum lipase II on an 1850 bp long EcoRI/XhoI insert 
(see FIG. 9). 

To develop an expression vector for S. cerevisiae with 
homologous signal sequences, the N-terminus of the mature 
lipase B was determined experimentally by standard 
techniques. The obtained amino acid sequence of 
"Gln-Ala-Pro-Thr-Ala-Val..." is in complete agreement with 
the cleavage [...] (see reference 19). 

For the fusion of the mature lipase B to the S. cerevisiae 
signal sequences of SUC2 (invertase) or α-mating factor 
(prepro-αMF) on one hand and the in-frame fusion to the 3' 
part of the AGα1 gene, the PCR technique can be used. The PCR 
primer lipo3 (see SEQ ID NO: 13) can be constructed in 
such a way that the originally present EagI site in the 5'-part 
of the coding sequence (spanning codons 5-7 of the mature 
protein) will become inactivated without any alteration in 
the amino acid sequence. To facilitate the subsequent cloning 
procedures, the PCR primer can further contain a new 
EagI site at the 5' end, for the in-frame ligation to SUC2 
signal sequence or prepro-αMF sequence, respectively. The 
corresponding PCR primer lipo4 (see SEQ ID NO: 16) 
contains an extra NheI site behind the nucleotides coding for 
the C-terminus of lipase B, to ensure the proper fusion to the 
C-terminal part of α-agglutinin. 

PCR oligonucleotides for the in frame linkage of G. 
candidum lipase II to the SUC2 signal sequence and the 
C-terminal part of α-agglutinin. 

a: N-terminal transition to either prepro αMF sequence or 
SUC2 signal sequence. 

                       EagI       A   Q   A   P   R   P   S   L   N 
primer lipo3:  5'-GOG GCG GCC GCG CAG GCC CCA AGG CGG TCT CTC AAT-3' 
                          ||  ||| ||| | | |   ||  ||| ||| ||| ||| 
lipase II:             3'-GAC CGG GTC CGG GGT GCC GCC AGA GAG TTA-5' 
(non-cod. strand, see SEQ ID NO: 14) 

b: C-terminal fusion to C part of α-agglutinin 

                   S   N   F   E   T   D   V   N   L   Y   G 
lipase:       5'-CA AAC TTT GAG ACT GAC GTT AAT CTC TAC GGT TAA AAC-3' 
(cod. strand)      |  ||| ||| ||| ||| ||| ||| ||| ||| ||| 
primer lipo4:              3'-C TGA CTG CAA TTA GAG ATG CCA CGATCG CCCC-5' 
                                                             NheI 
(for the part of the lipase coding strand see SEQ ID NO: 15) 

The PCR product with the modified ends can be generated 
by standard PCR protocols, using instead of the normal 
Ampli-Taq polymerase the new thermostable VENT 
polymerase, which also exhibits proofreading activity, to 
ensure an error-free DNA template. Through digestion of the 
formerly described plasmid pUR2972 with EagI (complete) 
and NheI (partial), the Humicola lipase fragment can be 
exchanged against the DNA fragment coding for lipase B, 
thereby generating the final S. cerevisiae expression vector 
pUR2975 (see FIG. 9). 

The Humicola lipase-α-agglutinin fusion protein coding 
sequence can be exchanged against the lipase B/α-agglutinin 
fusion construct described above by digestion of 
the described vector pUR2973 with EagI/HindIII, resulting 
in pUR2976 (see FIG. 9). 

EXAMPLE 5 

Immobilized Rhizomucor miehei Lipase/α-Agglutinin 
on the Surface of S. cerevisiae 

The construction and isolation of the 1.4 kb NheI/HindIII 
fragment encoding the C-terminal part of α-agglutinin has 
been described in EXAMPLE 1. The plasmid pUR2980 
contains a 1.25 kb cDNA fragment cloned into the SmaI site 
of commercially available pUC18, which (synthetically 
synthesizable) fragment encodes the complete coding 
sequence of triglyceride lipase of Rhizomucor miehei (see 
reference 20), an enzyme used in a number of processes to 
interesterify triacylglycerols (see reference 21) or to prepare 
biosurfactants (see reference 22). Beside the 269 codons of 
the mature lipase molecule, the fragment also harbours 
codons for the 24 amino acid signal peptide as well as 70 
amino acids of the propeptide. PCR can easily be applied to 
ensure the proper fusion of the gene fragment encoding the 
mature lipase to the SUC2 signal sequence or the prepro 
α-mating factor sequence of S. cerevisiae, as well as the 
in-frame fusion to the described NheI/HindIII fragment. The 
following two primers, lipo5 (see SEQ ID NO: 17) and lipo6 
(see SEQ ID NO: 20), will generate a 833 bp DNA fragment, 
which after Proteinase K treatment and digestion with EagI 
and NheI can be cloned as an 816 bp long fragment into the 
EagI/NheI digested plasmids pUR2972 and pUR2973, 
respectively (see FIG. 7). 

                                   EagI      A   S   I   D   G   G   I 
lipo5:                      5'-CCC GCG GCC GCG AGC ATT GAT GGT GGT ATC-3' 
                                               ||| ||| ||| ||| ||| ||| 
lipase (non-cod. strand):                   3'-TCG TAA CTA GCA CCA TAG-5' 
(for the part of the lipase non-coding strand, see SEQ ID NO: 18) 

                               N   T   G   L   C   T 
lipase (cod. strand):      5'-AAC ACA GGC CTC TGT ACT-3' 
                              ||| ||| ||| ||| ||| ||| 
lipo6:                     3'-TTG TGT CCG GAG ACA TGA CGATCGC GCC-5' 
                                                      NheI 
(for the part of the lipase coding strand, see SEQ ID NO: 19) 
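
The primer diagrams above pair each primer with the lipase strand it anneals to. As a minimal sketch, and not part of the method described here, the following Python check takes the lipo6 diagram as printed (spaces removed) and tests that the annealing part of the primer is complementary to the coding strand and that its tail carries the NheI recognition sequence GCTAGC. All function and constant names are illustrative assumptions.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons of a 5'->3' coding sequence."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Sequences copied from the lipo6 diagram, spaces removed.
LIPASE_CODING_END = "AACACAGGCCTCTGTACT"       # coding strand, 5'->3'
LIPO6_3_TO_5 = "TTGTGTCCGGAGACATGACGATCGCGCC"  # primer as printed, 3'->5'
LIPO6 = LIPO6_3_TO_5[::-1]                     # primer written 5'->3'
NHEI_SITE = "GCTAGC"


def check_lipo6():
    """Compare the primer with the template and locate the NheI site."""
    sense = reverse_complement(LIPO6)  # strand produced opposite the primer
    return {
        "anneals": sense.startswith(LIPASE_CODING_END),
        "has_nhei": NHEI_SITE in sense,
        "lipase_end": translate(LIPASE_CODING_END),
        "junction": translate(sense[:24]),  # lipase end plus the NheI codons
    }


if __name__ == "__main__":
    print(check_lipo6())
```

With the sequences as printed, the lipase end translates to NTGLCT, matching the one-letter residues shown above the coding strand, and the NheI site adds the codons GCT AGC directly after the last lipase codon.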
These new S. cerevisiae expression plasmids contain the 
GAL7 promoter, the invertase signal sequence (pUR2981) 
or the prepro-α-mating factor sequence (pUR2982), the 
chimeric Rhizomucor miehei lipase/α-agglutinin gene, the 2 µm 
sequence, the defective (truncated) Leu2 promoter and 
the Leu2 gene. These plasmids can be transformed into S. 
cerevisiae and grown and analyzed using protocols 
described in earlier EXAMPLES. 

EXAMPLE 6 

Immobilized Aspergillus niger Glucose Oxidase/ 
GPI Anchored Cell Wall Proteins on the Surface of 
S. cerevisiae 

Glucose oxidase (β-D-glucose:oxygen 1-oxidoreductase, EC 
1.1.3.4) from Aspergillus niger catalyses the oxidation of 
β-D-glucose to glucono-δ-lactone and the concomitant 
reduction of molecular oxygen to hydrogen peroxide. The 
fungal enzyme consists of a homodimer of molecular weight 
150,000 containing two tightly bound FAD co-factors. 
Beside the use in glucose detection kits the enzyme is useful 
as a source of hydrogen peroxide in food preservation. The 
gene was cloned from both cDNA and genomic libraries, the 
single open reading frame contains no intervening sequences 
and encodes a protein of 605 amino acids (see reference 23). 

With the help of two proper oligonucleotides the coding 
part of the sequence is adjusted in a one-step modifying 
procedure by PCR in such a way that a fusion gene product 
will be obtained coding for glucose oxidase and the 
C-terminal cell wall anchor of the FLO1 gene product or 
α-agglutinin. Thus, some of the plasmids described in 
former EXAMPLES can be utilized to integrate the 
corresponding sequence in-frame between one of the signal 
sequences used in the EXAMPLES and the NheI/HindIII 
part of the AGα1 gene. 

Since dimerisation of the two monomers might be a 
prerequisite for activity, in an alternative approach the 
complete coding sequence for glucose oxidase without the 
GPI anchor can be expressed in an S. cerevisiae transformant 
which already contains the fusion construct. This can be 
fulfilled by constitutive expression of the fusion construct 
containing the GPI anchor with the help of the GAPDH or 
PGK promoter for example. The unbound not-anchored 
monomer can be produced by using a DNA construct 
comprising an inducible promoter, as for instance the GAL7 
promoter. 

EXAMPLE 7 

Process to Convert Raffinose, Stachyose and 
Similar Sugars in Soy Extracts with α-Galactosidase/ 
α-Agglutinin Immobilized on Yeasts 

The yeast transformed with plasmid pUR2969 can be 
cultivated on large scale. At regular intervals during 
cultivation the washed cells should be analyzed on the presence 
of α-galactosidase activity on their surface with methods 
described in EXAMPLE 1. When both cell density and 
α-galactosidase activity/biomass reach their maximum, the 
yeast cells can then be collected by centrifugation and 
washed. The washed cells can then be added to soy extracts. 
The final concentration of the yeast cells can vary between 
0.1 and 10 g/l, preferably the concentration should be above 
1 g/l. The temperature of the soy extract should be <8° C. to 
reduce the metabolic activity of the yeast cells. The 
conversion of raffinose and stachyose can be analyzed with HPLC 
methods and after 95% conversion of these sugars the yeast 
cells can be removed by centrifugation and their 
α-galactosidase activity/g biomass can be measured. 
Centrifugates with a good activity can be used in a subsequent 
conversion process, whereas centrifugates with an activity 
of less than 50% of the original activity can be resuscitated 
in the growth medium and the cells can be allowed to 
recover for 2 to 4 hours. Thereafter the cells can be 
centrifuged, washed and subsequently be used in a 
subsequent conversion process. 

EXAMPLE 8 

Production of Biosurfactants Using Humicola 
Lipase/α-Agglutinin Immobilized on Yeasts. 

The yeast transformed with plasmid pUR2972 or 
pUR2973 can be cultivated on large scale. At regular 
intervals during cultivation the washed cells can be analyzed on 
the presence of lipase activity on their surface with methods 
described in EXAMPLE 2. When both cell density and 
lipase/biomass reach their maximum, the yeast cells can be 
collected by centrifugation and washed. The washed cells 
can be suspended in a small amount of water and added to 
a reactor tank containing a mix of fatty acids, preferably of 
a chain length between 12-[...] carbon atoms and sugars, 
preferably glucose, galactose or sucrose. The total 
concentration of the water (excluding the water in the yeast cells) 
might be below [...]. The final concentration of the yeast 
cells can vary between [...]. The tank has to be kept under 
an atmosphere of N2 and CO2 in order to avoid oxidation of 
(unsaturated) fatty acids and to minimize the metabolic 
activity of the yeasts. The temperature of mixture in the tank 
should be between [...] depending on type of fatty 
acid used. [...] 
methods and after 95% conversion of these fatty 
acids the yeast cells can be removed by centrifugation and 
their lipase activity/g biomass can be measured. 
Centrifugates with a good activity can be used in a subsequent 
conversion process, whereas centrifugates with an activity 
of less than 50% of the original activity can be resuscitated 
in the growth medium and the cells can be allowed to 
recover for 2 to 8 hours. Thereafter the cells can be 
centrifuged again, washed and used in a subsequent conversion 
process. 

EXAMPLE 9 

Production of Special Types of Triacylglycerols 
using Rhizomucor miehei Lipase/α-Agglutinin 
Immobilized on Yeasts. 

The yeast transformed with plasmid pUR2981 or 
pUR2982 can be cultivated on a large scale. At regular 
intervals during cultivation the washed cells can be analyzed 
on the presence of lipase activity on their surface with 
methods described in EXAMPLE 1. When both cell density 
and lipase/biomass reach their maximum, the yeast cells can 
be collected by centrifugation and washed. The washed cells 
can be suspended in a small amount of water and can be 
added to a reactor tank containing a mix of various 
triacylglycerols and fatty acids. The total concentration of the water 
(excluding the water in the yeast cells) might be below 0.1%. 
The final concentration of the yeast cells can vary between 
0.1 and 10 g/l, preferably the concentration is above 1 g/l. 
The tank has to be kept under an atmosphere of N2 and CO2 
in order to avoid oxidation of the (unsaturated) fatty acids 
and to minimize the metabolic activity of the yeasts. The 
temperature of mixture in the tank should be between 
30-70° C., depending on types of triacylglycerol and fatty 
acid used. The degree of interesterification can be analyzed 
with GLC/MS methods and after formation of at least 80% 
of the theoretical value of the desired type of triacylglycerol 
the yeast cells can be removed by centrifugation and their 
lipase activity/g biomass can be measured. Centrifugates 
with a good activity can be used in a subsequent conversion 
process, whereas centrifugates with an activity of less than 
50% of the original activity can be resuscitated in the growth 
medium and the cells should be allowed to recover 2 to 8 
hours. After that the cells can be centrifuged, washed and 
used in a subsequent interesterification process. 

Baker's yeasts of strain MT302/1C, transformed with 
either plasmid pSY13 or plasmid pUR2969 (described in 
EXAMPLE 1) were deposited under the Budapest Treaty at 
the Centraalbureau voor Schimmelcultures (CBS) on Jul. 3, 
1992 under provisional numbers 330.92 and 329.92, 
respectively. 

EXAMPLE 10 

Immobilized Humicola Lipase/FLO1 Fusion on the 
Surface of S. cerevisiae 

Flocculation, defined as "the (reversible) aggregation of 
dispersed yeast cells into flocs" (see reference 24), is the 
most important feature of yeast strains in industrial 
fermentations. Beside this it is of principal interest, because it is a 
property associated with cell wall proteins and it is a 
quantitative characteristic. One of the genes associated with 
the flocculation phenotype in S. cerevisiae is the FLO1 gene. 
The gene is located at approximately 24 kb from the right 
end of chromosome I and the DNA sequence of a clone 
containing major parts of FLO1 gene has very recently been 
determined (see reference 26). The sequence is given in FIG. 
11 and SEQ ID NO: 21 and 22. The cloned fragment 
appeared to be approximately 2 kb shorter than the genomic 
copy as judged from Southern and Northern hybridizations, 
but encloses both ends of the FLO1 gene. Analysis of the 
DNA sequence data indicates that the putative protein 
contains at the N-terminus a hydrophobic region which confirms 
a signal sequence for secretion, a hydrophobic C-terminus 
that might function as a signal for the attachment of a 
GPI-anchor and many glycosylation sites, especially in the 
C-terminus, with 46.6% serine and threonine in the 
arbitrarily defined C-terminus (aa 271-894). Hence, it is likely 
that the FLO1 gene product is localized in an orientated 
fashion in the yeast cell wall and may be directly involved 
in the process of interaction with neighbouring cells. The 
cloned FLO1 sequence might therefore be suitable for the 
immobilization of proteins or peptides on the cell surface by 
a different type of cell wall anchor. 

Recombinant DNA constructs can be obtained, for 
example by utilizing the DNA coding for amino acids 
271-894 of the FLO1 gene product, i.e. polynucleotide 
811-2682 of FIG. 11. Through application of two PCR 
primers pcrflo1 (see SEQ ID NO: 23) and pcrflo2 (see SEQ 
ID NO: 26) NheI and HindIII sites can be introduced at both 
ends of the DNA fragment. In a second step, the 1.4 kb 
NheI/HindIII fragment present in pUR2972 (either A or B) 
containing the C-terminal part of α-agglutinin can be 
replaced by the 1.9 kb DNA fragment coding for the 
C-terminal part of the FLO1 protein, resulting in plasmid 
pUR2990 (see FIG. 12), comprising a DNA sequence 
encoding (a) the invertase signal sequence (SUC2) preceding (b) 
the fusion protein consisting of (b.1) the lipase of Humicola 
(see reference 16) followed by (b.2) the C-terminus of FLO1 
protein (aa 271-894). 

PCR oligonucleotides for the in frame connection of the 
genes encoding the Humicola lipase and the C-terminal 
part of the FLO1 gene product. 

                                            S   N   Y   A   V   S   T 
primer pcrflo1:             5'-GAATTC GCT AGC AAT TAT GCT GTC AGT AAC-3' 
                                      NheI    ||| ||| ||| ||| ||| ||| 
FLO1 gene (non-coding strand):        3'-AGT TTA ATA CGA CAG TCA TGG TGA-5' 
(for the part of the non-coding strand, see SEQ ID NO: 24) 

FLO1 coding strand:   5'-AATAA AATTCGCGTTCTT TTTACG-3' 
                               ||||||||||||||||||| 
primer pcrflo2:             3'-TTAAGCGCAAGAAAAATGC TTCGAAC TCGAG-5' 
                                                   HindIII 
(for the part of the coding strand, see SEQ ID NO: 25) 

Plasmid pUR2972 (either A or B) can be restricted with 
NheI (partial) and HindIII and the NheI/HindIII fragment 
comprising the vector backbone and the lipase gene can be 
ligated to the correspondingly digested PCR product of the 
plasmid containing the FLO1 sequence, resulting in plasmid 
pUR2990, containing the GAL7 promoter, the S. cerevisiae 
invertase signal sequence, the chimeric lipase/FLO1 gene, 
the yeast 2 µm sequence, the defective Leu2 promoter and 
the Leu2 gene. This plasmid can be transformed into S. 
cerevisiae and the transformed cells can be cultivated in YP 
medium including galactose as inductor. 

The expression, secretion, localization and activity of the 
chimeric lipase/FLO1 protein can be analyzed using similar 
procedures as given in Example 1. 
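
Example 10 equates amino acids 271-894 of the FLO1 gene product with polynucleotide 811-2682 of FIG. 11 and describes the fragment as 1.9 kb. The short Python sketch below reproduces that arithmetic. It assumes that nucleotide 1 in the FIG. 11 numbering is the first base of codon 1, which is what the two quoted ranges imply; the function name and the `cds_start` parameter are illustrative only.

```python
def codon_span_to_nucleotides(first_codon, last_codon, cds_start=1):
    """Map an inclusive codon range to an inclusive nucleotide range.

    cds_start is the position of the first base of codon 1
    (assumed here to be 1 in the FIG. 11 numbering).
    """
    first_nt = cds_start + 3 * (first_codon - 1)
    last_nt = cds_start + 3 * last_codon - 1
    return first_nt, last_nt


if __name__ == "__main__":
    first_nt, last_nt = codon_span_to_nucleotides(271, 894)
    fragment_bp = last_nt - first_nt + 1
    residues = 894 - 271 + 1
    print(first_nt, last_nt, fragment_bp, residues)  # 811 2682 1872 624
```

The 1872 bp covering 624 residues is consistent with the 1.9 kb fragment named in the text, before the primer-added NheI and HindIII ends are counted.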
SEQUENCE LISTING 

(1) GENERAL INFORMATION: 

(iii) NUMBER OF SEQUENCES: 26 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6057 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Saccharomyces cerevisiae 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 3653..5605 

(D) OTHER INFORMATION: /function= "sexual agglutinisation" 
/product= "alpha-agglutinin" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
AAGCTTTAGG TAAGGGAGGC AGGGGGAAAA GATACTGAAA TGACGGAAAA CGAGAATATG 60 
GAGCAGGGAG CAACTTTTAG AGCTTTACCC GTTAAAAGGT CAAATCGAGG CTTCCTGCCT 120 
TTGTCTGATT TTAGTAGTAC CGGAAGGTTT ATTACGCCCA AGAACAGTGC TTGAATTGAG 180 
TTCTCGGGAC ACGGGAAAGA CAATGGAAGA AAAATTTACA TTCAGTAGCC TTATATATGA 240 
AATGCTGCCA AGCCACGTCT TTATAAGTAG ATAATGTCCC ATGAGCTGAA CTATGGGAAT 300 
TTATGACGCA GTTCATTGTA TATATATTAC ATTAACTCTT TAGTTTAACA TCTGAATTGT 360 
TTTATAAAAT AACTTTTTGA ATTTTTTTAT GATCGCTTAG TTAAGTCTAT TATATCAGGT 420 
TTTTTCATTC ATCATAATTG TTCGTTAAAT ATGAGTATAT TTAAATACAG GAATTAGTAT 480 
CATTTGCAGT CACGAAAAGG GCCGTTTCAT AGAGAGTTTT CTTAATAAAG TTGAGGGTTT 540 
CCGTGATAGT TTTGAGGGGT TGTTTGAACT AGATTTACGC TTACCTTTCA ACTGATTAAT 600 
TTTTTCAGCG GGCTTATCAT AATCATCCAT CATAGCAGTC TTTCTGGACT TCGTCGAGGA 660 
CTGGCTTTCT GAATTTTGAC GGTCCCTATT AGCTCCAGTT GGAGGAATTG AGTTACCTAC 720 
AACTGGCAAG AGGTCTTTGT TTGGATTCAA AATAGGACTT TGTGGTAGCA GTTTGGTTTT 780 
ATTCAATCTA AAGATATGAG AAACAGGTTT TAAGTAAATC GATACTATTG TACCAATGTT 840 
TAGCTCCAAT TCCTCCAAAA CGGTGGGATC TAATTTTGTG TTCATTTCTA TTAGTGGCAA 900 
CTCTCCGTCC AGTACTGATT TTAAAGATTC AAAAGTTATC GCGTTTGATA TACGAGACGT 960 
TTTCGTTAAT GACAGCAATC TCCAATACAT CAGTGTTTTA TCTCTTAAGT CAGGATTATT 1020 
TTCGTCATCG GTGCATCCTT TTAATAAATC CATACAAAGT TCTTCAGTTT CCTTTGTAGG 1080 
ATTTCTGATG AAGAATTTTA TTGCTGAGTT CAGAATGGAA AATTGCACTT CTAGCGTCTC 1140 
ATTAAACATG TTTGAGGAAA AAACTCTAAA TAACTCCAGG TAGTTTGGAA TTACATCCGA 1200 
ATATTGCGTT ATTATCCAGA TCATAGCGTT TTTTGATTCA GGTTCCTGTA CAACTTCAGT 1260 
GTGTTTGACT AGTTCTGTTA CGTTTGCTTT AAAATTATTG GGATATTTCC TCAAAATATT 1320 
TCTGAAAACC GAAATAATCT CCTGGACGAC ATAATCAACA CCGAATTCTA ACAAATCTAG 1380 
TAGCACAGCG ACACAATCGT GTACAGAGTC TTCATCTAGC TTAACAGCGA GATTACCAAT 1440 
GGCTCTGACT GATTTCCTTG ACATTTGAAT ATCAATATCT GTAGCATATT GTTCCAACTC 1500 
TTCTAGAATT CTTGGTAATG TTTCCTTGTT AGCTAAAAGA TATAAACACT CTAATTTCGT 1560 
GTCTTTGATG TATATGGGGT CATTGTACTC GATGAAAAAA TACGAAATGT CTAGCCTGAG 1620 
TAGAGATGAC TCCCTACTCA ATAAAAGAAG AATAACGTTT CTTAATACTA AAAATTGTAA 1680 
TTCAGGCGGC TTATCTAACA AAGCTATTAC AGAGTTAGAT AGCTTTTCGG CTAGAGTTTC 1740 
TTTGATGACG TCAACATAAT TCAACAAGTA CATGATGAAT TTTAAAGAGT TCAACACTAC 1800 
GTATGTGTTT ACTTGTTGCA GGTACGGTAA AGCTAGTTCG ATCATTTCAT GGGTATCCAA 1860 
ATAATGCTGC GGCACAACCG AAGTCGTCAA AACTTCCAAA ACAGTAGCCT TATTCCACTC 1920 
ATTTAATTCG GGTAAAAGTT CTAGCATGTC AAAAGCGAGT TCCAAGGGAA TCCTGAAGGT 1980 
TCCATGTTAG CGTTTTTTTC GTGAATGGAA TATAAAGTAT GTAATGCAGC TACAATGACT 2040 
TCTGGAGAGC TCGACTGTGC CTTTACAATG TCATGTAGAA TGCTTGATAA CCCCAATACC 2100 
CTTTCATGAT CAATTTCATC TAAATCCAAC AGTGCGTAAA TTGCTGTCCT CGTCACTTGT 2160 
TCAGGTGGAG ACTTGTGATT TACCAATGAA ATGATACAGT CGAAGGCCTG ATCAGATAGC 2220 
TCTTTCACCG GGACTAATAC CAGAGTTCTT AGTGCCATTA TTTGTAACTT TTCATCTCTG 2280 
CTTTTGAAAT CGTCCATTAT AAATGGCAAA GCCTCTCTGG CCTGCTGAGG TTTTAATGCG 2340 
CCGATCACCC TAATATACTC ATGGCAAATT CTTTTCACTT CTAGATCATC TTCAATTTGC 2400 
CAAAATTTCA AGAGCTCAGA AAACAGAAGG GACATTTCGC CATAGTTTCC TAGAACCAAA 2460 
TTGGCGATAA TTTTTCTCAG AGCATTTTTC CTTCTTGTTA TATTCGATTT AAACTTTTTT 2520 
ACTCCAAAAT GTTGCAGATC TGTGACGATT TCATTTGCTT TATATCTGGC AAAAACTTTT 2580 



TGATCGGACA TAAGCGAAAT ACGTCCTATT AATGAAGTGA ATGTTCTTGC TGTATTCCCT 2640 

TCTTGTGCAG TAGATTAATT CTGTTTCCAG GCTGCGATAC TTTGATACCC AATACTAAAA 2700 
GTTGATGATT TGAACGATCT CCTATTTCCT CGCACATTTT TGGAGCGATA CCCGGAAGAC 2760 
AGAATCGCGA TGTTAAGAAA ATAGTTCTGA TGGCACTAAA GAGATCATGA TTAAGGAAAG 2820 
GTAAGTGATA TGCATGAATG GGAATAGGCT TTCGAACTTG ACGATTTAGT TCCTTATTTC 2880 
TATCCATCTA ATCCTCCAAC TTCAATAGGC CTTATCTAGC TCAGAGCAGT ATTTAATTGA 2940 
GAATAGTAGC TTAATTGAAA CCTTACTAAA AAAGTGTATG GTTACATAAG ATAAGGCGTT 3000 
AAGAAGAGTA TACATATGCA TTATTCATTA CCAAGACCAC TATGAATAGT AATACCATAT 3060 
TTAGCTTTTG AAACTCATGT TTTCTATTGT GTTGTTTCAA ATTCCTCTGT TAGGCTCAAT 3120 
TTAGGTTAAT TAAATTATAA AAAAATATAA AAAATAAAGA AAGTTTATCC ATCGGCACCT 3180 
CAATTCAATG GAGTAAACAG TTTCAACACT GAGTGGTGAA ACATTGAACA ACTACATGCA 3240 
GTTTCCCGCC ACGAGGCAAG TGTAGGTCCT TTGTCCATTT CGCTTTGTTT TGCAGGTCAT 3300 
TGATGACCTA ATTAGGAAGG TAGAAGCCGC TCCAGCTCAA TAAGGAAATG CTAAGGGTAC 3360 
TCGCCTTTGG TGTTTTACCA TACAATGGCA GCTTTATGTC ACTTCATTCT TCAGTAACGG 3420 
CGCTTAAATA TTCCCAAAAA CGTTACAATG GAATTGTTTG ATCATGTAAC GAAATGCAAT 3480 
CTTCTAAAAA AAAAGCCATG TGAATCAAAA AAAGATTCCT TTTAGCATAC TATAAATATG 3540 
CAAAATGCCC TCTATTTATT CTAGTAATCG TCCATTCTCA TATCTTCCTT ATATCAGTCG 3600 
CCTCGCTTAA TATAGTCAGC ACAAAAGGAA CAACAATTCG CCAGTTTTCA AA ATG 3655 



Met 
1 

TTC ACT TTT CTC AAA ATT ATT CTG TGG CTT TTT TCC TTG GCA TTG GCC 3703 
Phe Thr Phe Leu Lys Ile Ile Leu Trp Leu Phe Ser Leu Ala Leu Ala 
5 10 15 

TCT GCT ATA AAT ATC AAC GAT ATC ACA TTT TCC AAT TTA GAA ATT ACT 3751 
Ser Ala Ile Asn Ile Asn Asp Ile Thr Phe Ser Asn Leu Glu Ile Thr 
20 25 30 

CCA CTG ACT GCA AAT AAA CAA CCT GAT CAA GGT TGG ACT GCC ACT TTT 3799 
Pro Leu Thr Ala Asn Lys Gln Pro Asp Gln Gly Trp Thr Ala Thr Phe 
35 40 45 

GAT TTT AGT ATT GCA GAT GCG TCT TCC ATT AGG GAG GGC GAT GAA TTC 3847 
Asp Phe Ser Ile Ala Asp Ala Ser Ser Ile Arg Glu Gly Asp Glu Phe 
50 55 60 65 

ACA TTA TCA ATG CCA CAT GTT TAT AGG ATT AAG CTA TTA AAC TCA TCG 3895 
Thr Leu Ser Met Pro His Val Tyr Arg Ile Lys Leu Leu Asn Ser Ser 
70 75 80 

CAA ACA GCT ACT ATT TCC TTA GCG GAT GGT ACT GAG GCT TTC AAA TGC 3943 
Gln Thr Ala Thr Ile Ser Leu Ala Asp Gly Thr Glu Ala Phe Lys Cys 
85 90 95 

TAT GTT TCG CAA CAG GCT GCA TAC TTG TAT GAA AAT ACT ACT TTC ACA 3991 
Tyr Val Ser Gln Gln Ala Ala Tyr Leu Tyr Glu Asn Thr Thr Phe Thr 
100 105 110 

TGT ACT GCT CAA AAT GAC CTG TCC TCC TAT AAT ACG ATT GAT GGA TCC 4039 
Cys Thr Ala Gln Asn Asp Leu Ser Ser Tyr Asn Thr Ile Asp Gly Ser 
115 120 125 

ATA ACA TTT TCG CTA AAT TTT AGT GAT GGT GGT TCC AGC TAT GAA TAT 4087 
Ile Thr Phe Ser Leu Asn Phe Ser Asp Gly Gly Ser Ser Tyr Glu Tyr 
130 135 140 145 

GAG TTA GAA AAC GCT AAG TTT TTC AAA TCT GGG CCA ATG CTT GTT AAA 4135 
Glu Leu Glu Asn Ala Lys Phe Phe Lys Ser Gly Pro Met Leu Val Lys 
150 155 160 

CTT GGT AAT CAA ATG TCA GAT GTG GTG AAT TTC GAT CCT GCT GCT TTT 4183 
Leu Gly Asn Gln Met Ser Asp Val Val Asn Phe Asp Pro Ala Ala Phe 
165 170 175 

ACA GAG AAT GTT TTT CAC TCT GGG CGT TCA ACT GGT TAC GGT TCT TTT 4231 
Thr Glu Asn Val Phe His Ser Gly Arg Ser Thr Gly Tyr Gly Ser Phe 
180 185 190 

GAA AGT TAT CAT TTG GGT ATG TAT TGT CCA AAC GGA TAT TTC CTG GGT 4279 
Glu Ser Tyr His Leu Gly Met Tyr Cys Pro Asn Gly Tyr Phe Leu Gly 
195 200 205 

GGT ACT GAG AAG ATT GAT TAC GAC AGT TCC AAT AAC AAT GTC GAT TTG 4327 
Gly Thr Glu Lys Ile Asp Tyr Asp Ser Ser Asn Asn Asn Val Asp Leu 
210 215 220 225 

GAT TGT TCT TCA GTT CAG GTT TAT TCA TCC AAT GAT TTT AAT GAT TGG 4375 
Asp Cys Ser Ser Val Gln Val Tyr Ser Ser Asn Asp Phe Asn Asp Trp 
230 235 240 

TGG TTC CCG CAA AGT TAC AAT GAT ACC AAT GCT GAC GTC ACT TGT TTT 4423 
Trp Phe Pro Gln Ser Tyr Asn Asp Thr Asn Ala Asp Val Thr Cys Phe 
245 250 255 

GGT AGT AAT CTG TGG ATT ACA CTT GAC GAA AAA CTA TAT GAT GGG GAA 4471 
Gly Ser Asn Leu Trp Ile Thr Leu Asp Glu Lys Leu Tyr Asp Gly Glu 
260 265 270 

ATG TTA TGG GTT AAT GGA TTA CAA TCT CTA CCC GCT AAT GTA AAC ACA 4519 
Met Leu Trp Val Asn Ala Leu Gln Ser Leu Pro Ala Asn Val Asn Thr 
275 280 285 

ATA GAT CAT GCG TTA GAA TTT CAA TAC ACA TGC CTT GAT ACC ATA GCA 4567 
Ile Asp His Ala Leu Glu Phe Gln Tyr Thr Cys Leu Asp Thr Ile Ala 
290 295 300 305 

AAT ACT ACG TAC GCT ACG CAA TTC TCG ACT ACT AGG GAA TTT ATT GTT 4615 
Asn Thr Thr Tyr Ala Thr Gln Phe Ser Thr Thr Arg Glu Phe Ile Val 
310 315 320 

TAT CAG GGT CGG AAC CTC GGT ACA GCT ACC GCC AAA AGC TCT TTT ATC 4663 
Tyr Gln Gly Arg Asn Leu Gly Thr Ala Ser Ala Lys Ser Ser Phe Ile 
325 330 335 

TCA ACC ACT ACT ACT GAT TTA ACA AGT ATA AAC ACT AGT GCG TAT TCC 4711 
Ser Thr Thr Thr Thr Asp Leu Thr Ser Ile Asn Thr Ser Ala Tyr Ser 
340 345 350 

ACT GGA TCC ATT TCC ACA GTA GAA ACA GGC AAT CGA ACT ACA TCA GAA 4759 
Thr Gly Ser Ile Ser Thr Val Glu Thr Gly Asn Arg Thr Thr Ser Glu 
355 360 365 

GTG ATC AGT CAT GTG GTG ACT ACC AGC ACA AAA CTG TCT CCA ACT GCT 4807 
Val Ile Ser His Val Val Thr Thr Ser Thr Lys Leu Ser Pro Thr Ala 
370 375 380 385 

ACT ACC AGC CTG ACA ATT GCA CAA ACC AGT ATC TAT TCT ACT GAC TCA 4855 
Thr Thr Ser Leu Thr Ile Ala Gln Thr Ser Ile Tyr Ser Thr Asp Ser 
390 395 400 

AAT ATC ACA GTA GGA ACA GAT ATT CAC ACC ACA TCA GAA GTG ATT AGT 4903 
Asn Ile Thr Val Gly Thr Asp Ile His Thr Thr Ser Glu Val Ile Ser 
405 410 415 

GAT GTG GAA ACC ATT AGC AGA GAA ACA GCT TCG ACC GTT GTA GCC GCT 4951 
Asp Val Glu Thr Ile Ser Arg Glu Thr Ala Ser Thr Val Val Ala Ala 
420 425 430 

CCA ACC TCA ACA ACT GGA TGG ACA GGC GCT ATG AAT ACT TAC ATC CCG 4999 
Pro Thr Ser Thr Thr Gly Trp Thr Gly Ala Met Asn Thr Tyr Ile Pro 
435 440 445 

CAA TTT ACA TCC TCT TCT TTC GCA ACA ATC AAC AGC ACA CCA ATA ATC 5047 
Gln Phe Thr Ser Ser Ser Phe Ala Thr Ile Asn Ser Thr Pro Ile Ile 
450 455 460 465 

TCT TCA TCA GCA GTA TTT GAA ACC TCA GAT GCT TCA ATT GTC AAT GTG 5095 
Ser Ser Ser Ala Val Phe Glu Thr Ser Asp Ala Ser Ile Val Asn Val 
470 475 480 



CAC ACT GAA AAT ATC ACG AAT ACT GCT GCT GTT CCA TCT GAA GAG CCC 5143 
His Thr Glu Asn Ile Thr Asn Thr Ala Ala Val Pro Ser Glu Glu Pro 
485 490 495 

ACT TTT GTA AAT GCC ACG AGA AAC TCC TTA AAT TCC TTC TGC AGC AGC 5191 
Thr Phe Val Asn Ala Thr Arg Asn Ser Leu Asn Ser Phe Cys Ser Ser 
500 505 510 

AAA GAG CCA TCC AGT CCC TCA TCT TAT ACG TCT TCC CCA CTC GTA TCG 5239 
Lys Gln Pro Ser Ser Pro Ser Ser Tyr Thr Ser Ser Pro Leu Val Ser 
515 520 525 

TCC CTC TCC GTA AGC AAA ACA TTA CTA AGC ACC AGT TTT ACG CCT TCT 5287 
Ser Leu Ser Val Ser Lys Thr Leu Leu Ser Thr Ser Phe Thr Pro Ser 
530 535 540 545 

GTG CCA ACA TCT AAT ACA TAT ATC AAA ACG GAA AAT ACG GGT TAC TTT 5335 
Val Pro Thr Ser Asn Thr Tyr Ile Lys Thr Glu Asn Thr Gly Tyr Phe 
550 555 560 

GAG CAC ACG GCT TTG ACA ACA TCT TCA GTT GGC CTT AAT TCT TTT AGT 5383 
Glu His Thr Ala Leu Thr Thr Ser Ser Val Gly Leu Asn Ser Phe Ser 
565 570 575 

GAA ACA GCA CTC TCA TCT CAG GGA ACG AAA ATT GAC ACC TTT TTA GTG 5431 
Glu Thr Ala Leu Ser Ser Gln Gly Thr Lys Ile Asp Thr Phe Leu Val 
580 585 590 

TCA TCC TTG ATC GCA TAT CCT TCT TCT GCA TCA GGA AGC CAA TTG TCC 5479 
Ser Ser Leu Ile Ala Tyr Pro Ser Ser Ala Ser Gly Ser Gln Leu Ser 
595 600 605 

GGT ATC CAA CAG AAT TTC ACA TCA ACT TCT CTC ATG ATT TCA ACC TAT 5527 
Gly Ile Gln Gln Asn Phe Thr Ser Thr Ser Leu Met Ile Ser Thr Tyr 
610 615 620 625 

GAA GGT AAA GCG TCT ATA TTT TTC TCA GCT GAG CTC GGT TCG ATC ATT 5575 
Glu Gly Lys Ala Ser Ile Phe Phe Ser Ala Glu Leu Gly Ser Ile Ile 
630 635 640 

TTT CTG CTT TTG TCG TAC CTG CTA TTC TAAAACGGGT ACTGTACAGT 5622 
Phe Leu Leu Leu Ser Tyr Leu Leu Phe 
645 650 

TAGTACATTG AGTCGAAATA TACGAAATTA TTGTTCATAA TTTTCATCCT GGCTCTTTTT 5682 
TTCTTCAACC ATAGTTAAAT GGACAGTTCA TATCTTAAAC TCTAATAATA CTTTTCTAGT 5742 
TCTTATCCTT TTCCGTCTCA CCGCAGATTT TATCATAGTA TTAAATTTAT ATTTTGTTCG 5802 
TAAAAAGAAA AATTTGTGAG CGTTACCGCT CGTTTCATTA CCCGAAGGCT GTTTCAGTAG 5862 
ACCACTGATT AAGTAAGTAG ATGAAAAAAT TTCATCACCA TGAAAGAGTT CGATGAGAGC 5922 
TACTTTTTCA AATGCTTAAC AGCTAACCGC CATTCAATAA TGTTACGTTC TCTTCATTCT 5982 
GCGGCTACGT TATCTAACAA GAGGTTTTAC TCTCTCATAT CTCATTCAAA TAGAAAGAAC 6042 
ATAATCAAAA AGCTT 6057 
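
The feature table for SEQ ID NO: 1 gives the coding sequence as 3653..5605, and SEQ ID NO: 2 is listed as 650 amino acids. As a rough consistency check, sketched here under the assumption that the CDS range includes the terminating stop codon (the listing shows TAA directly after the Phe codon at position 650), the two figures can be compared as follows. The function name is illustrative.

```python
def cds_summary(location):
    """Summarise an inclusive 'start..end' CDS location string."""
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    length_bp = end - start + 1
    codons, remainder = divmod(length_bp, 3)
    return {
        "length_bp": length_bp,
        "codons": codons,
        "in_frame": remainder == 0,
        # Assumption: the final codon of the range is the stop codon.
        "residues": codons - 1,
    }


if __name__ == "__main__":
    print(cds_summary("3653..5605"))
```

Under that assumption the range spans 1953 bp, that is 651 codons in frame, leaving 650 residues, in agreement with the stated length of SEQ ID NO: 2.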

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 650 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Phe Thr Phe Leu Lys He He Leu Trp Leu Phe Ser Leu Ala Leu 

1 5 io' 15 

Ala Ser Ala He Asn He Asn Asp He Thr Phe Ser Asn Leu Glu He 
20 25 30 

Thr Pro Leu Thr Ala Asn Lys Gin Pro Asp Gin Gly Trp Thr Ala Thr 
35 40 45 
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Phe Aep Phe Ser lie Ala Asp Ala Ser Ser He Arg Glu Gly Asp Glu 
50 55 60 

Phe Thr Leu Ser Met Pro His Val Tyr Arg lie Lys Leu Leu Asn Ser 
65 70 75 80 

Ser Gin Thr Ala Thr He Ser Leu Ala Asp Gly Thr Glu Ala Phe Lye 
85 90 95 

Cys Tyr Val Ser Gin Gin Ala Ala Tyr Leu Tyr Glu Asn Thr Thr Phe 
100 105 110 

Thr Cys Thr Ala Gin Asn Asp Leu Sec Ser Tyr Asn Thr He Asp Gly 
115 120 125 

Ser He Thr Phe Ser Leu Asn Phe Ser Asp Gly Gly Ser Ser Tyr Glu 
130 135 140 

Tyr Glu Leu Glu Asn Ala Lys Phe Phe Lys Ser Gly Pro Met Leu Val 
145 150 155 160 

Lys Leu Gly Asn Gin Met Ser Asp Val Val Asn Phe Asp Pro Ala Ala 
165 170 175 

Phe Thr Glu Asn Val Phe His Ser Gly Arg Ser Thr Gly Tyr Gly Ser 
180 195 190 

Phe Glu Ser Tyr His Leu Gly Met Tyr Cys Pro Asn Gly Tyr Phe Leu 
195 200 205 

Gly Gly Thr Glu Lys He Asp Tyr Asp Ser Ser Asn Asn Asn Val Asp 
210 215 220 

Leu Asp Cys Ser Ser Val Gln Val Tyr Ser Ser Asn Asp Phe Asn Asp
225 230 235 240

Trp Trp Phe Pro Gln Ser Tyr Asn Asp Thr Asn Ala Asp Val Thr Cys
245 250 255

Phe Gly Ser Asn Leu Trp Ile Thr Leu Asp Glu Lys Leu Tyr Asp Gly
260 265 270

Glu Met Leu Trp Val Asn Ala Leu Gln Ser Leu Pro Ala Asn Val Asn
275 280 285

Thr Ile Asp His Ala Leu Glu Phe Gln Tyr Thr Cys Leu Asp Thr Ile
290 295 300

Ala Asn Thr Thr Tyr Ala Thr Gln Phe Ser Thr Thr Arg Glu Phe Ile
305 310 315 320

Val Tyr Gln Gly Arg Asn Leu Gly Thr Ala Ser Ala Lys Ser Ser Phe
325 330 335

Ile Ser Thr Thr Thr Thr Asp Leu Thr Ser Ile Asn Thr Ser Ala Tyr
340 345 350

Ser Thr Gly Ser Ile Ser Thr Val Glu Thr Gly Asn Arg Thr Thr Ser
355 360 365

Glu Val Ile Ser His Val Val Thr Thr Ser Thr Lys Leu Ser Pro Thr
370 375 380

Ala Thr Thr Ser Leu Thr Ile Ala Gln Thr Ser Ile Tyr Ser Thr Asp
385 390 395 400

Ser Asn Ile Thr Val Gly Thr Asp Ile His Thr Thr Ser Glu Val Ile
405 410 415

Ser Asp Val Glu Thr Ile Ser Arg Glu Thr Ala Ser Thr Val Val Ala
420 425 430

Ala Pro Thr Ser Thr Thr Gly Trp Thr Gly Ala Met Asn Thr Tyr Ile
435 440 445

Pro Gln Phe Thr Ser Ser Ser Phe Ala Thr Ile Asn Ser Thr Pro Ile
450 455 460

Ile Ser Ser Ser Ala Val Phe Glu Thr Ser Asp Ala Ser Ile Val Asn
465 470 475 480

Val His Thr Glu Asn Ile Thr Asn Thr Ala Ala Val Pro Ser Glu Glu
485 490 495

Pro Thr Phe Val Asn Ala Thr Arg Asn Ser Leu Asn Ser Phe Cys Ser
500 505 510

Ser Lys Gln Pro Ser Ser Pro Ser Ser Tyr Thr Ser Ser Pro Leu Val
515 520 525

Ser Ser Leu Ser Val Ser Lys Thr Leu Leu Ser Thr Ser Phe Thr Pro
530 535 540

Ser Val Pro Thr Ser Asn Thr Tyr Ile Lys Thr Glu Asn Thr Gly Tyr
545 550 555 560

Phe Glu His Thr Ala Leu Thr Thr Ser Ser Val Gly Leu Asn Ser Phe
565 570 575

Ser Glu Thr Ala Leu Ser Ser Gln Gly Thr Lys Ile Asp Thr Phe Leu
580 585 590

Val Ser Ser Leu Ile Ala Tyr Pro Ser Ser Ala Ser Gly Ser Gln Leu
595 600 605

Ser Gly Ile Gln Gln Asn Phe Thr Ser Thr Ser Leu Met Ile Ser Thr
610 615 620

Tyr Glu Gly Lys Ala Ser Ile Phe Phe Ser Ala Glu Leu Gly Ser Ile
625 630 635 640

Ile Phe Leu Leu Leu Ser Tyr Leu Leu Phe
645 650



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipol 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
GGGGCGGCCG AGGTCTCGCA AGATCTGGA 



(2) INFORMATION FOR SEQ ID NO: 4: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part non-coding strand lipase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TTTGTCCAGG TCTTGCGAGA CCTCTCGACG AAT 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: Part coding strand lipase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TTCGGGTTAA TTGGGACATG TCTTTAGTGC GA 32



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CCCCAAGCTT AAGGCTAGCA AGACATGTCC CAATTAACCC 40



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 894 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Humicola lanuginosa 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 72..884

(D) OTHER INFORMATION: /product= "lipase"

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 72..881

(D) OTHER INFORMATION: /product= "lipase"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

GAATTCGTAG CGACGATATG AGGAGCTCCC TTGTGCTGTT CTTTGTCTCT GCGTGGACGG 60

CCTTGGCCAC G GCC GAG GTC TCG CAA GAT CTG TTT AAC CAG TTC AAT CTC 110
Ala Glu Val Ser Gln Asp Leu Phe Asn Gln Phe Asn Leu
1 5 10

TTT GCA CAG TAT TCT GCT GCC GCA TAC TGC GGA AAA AAC AAT GAT GCC 158
Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn Asp Ala
15 20 25

CCA GCT GGT ACA AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC GAG GTA 206
Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro Glu Val
30 35 40 45

GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT GGA GTG 254
Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser Gly Val
50 55 60

GGC GAT GTC ACC GCC TTC CTT GCT CTA GAC AAC ACG AAC AAA TTG ATC 302
Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys Leu Ile
65 70 75

GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAA AAC TGG ATC GGA AAT 350
Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile Gly Asn
80 85 90

CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC TGC AGG 398
Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly Cys Arg
95 100 105

GGA CAT GAC GGC TTC ACC TCG AGC TGG AGG TCT GTA GCC GAT ACG TTA 446
Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp Thr Leu
110 115 120 125

AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC TAT CGC GTG 494
Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr Arg Val
130 135 140

GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA ACT GTT GCC GGA 542
Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val Ala Gly
145 150 155

GCA GAC CTG CGT GGA AAT GGG TAT GAC ATC GAC GTG TTT TCA TAT GGC 590
Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser Tyr Gly
160 165 170

GCC CCC CGA GTC GGA AAC AGG GCT TTT GCA GAA TTC CTG ACC GTA CAG 638
Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr Val Gln
175 180 185

ACC GGC GGT ACC CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT GTC CCT 686
Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile Val Pro
190 195 200 205

AGA CTC CCG CCG CGC GAG TTC GGT TAC AGC CAT TCT AGC CCA GAG TAC 734
Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro Glu Tyr
210 215 220

TGG ATC AAA TCT GGA ACC CTT GTC CCC GTC ACC CGA AAC GAC ATC GTG 782
Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp Ile Val
225 230 235

AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT AAC ATT 830
Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro Asn Ile
240 245 250

CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG ACA TGT 878
Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly Thr Cys
255 260 265

CTT TAGTGCGAAG CTT 894
Leu
270



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 270 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Ala Glu Val Ser Gln Asp Leu Phe Asn Gln Phe Asn Leu Phe Ala Gln
1 5 10 15

Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn Asp Ala Pro Ala Gly
20 25 30

Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro Glu Val Glu Lys Ala
35 40 45

Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser Gly Val Gly Asp Val
50 55 60

Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys Leu Ile Val Leu Ser
65 70 75 80

Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile Gly Asn Leu Asn Phe
85 90 95

Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly Cys Arg Gly His Asp
100 105 110

Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp Thr Leu Arg Gln Lys
115 120 125

Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr Arg Val Val Phe Thr
130 135 140

Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val Ala Gly Ala Asp Leu
145 150 155 160

Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser Tyr Gly Ala Pro Arg
165 170 175

Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr Val Gln Thr Gly Gly
180 185 190

Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile Val Pro Arg Leu Pro
195 200 205

Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro Glu Tyr Trp Ile Lys
210 215 220

Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp Ile Val Lys Ile Glu
225 230 235 240

Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro Asn Ile Pro Asp Ile
245 250 255

Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly Thr Cys Leu
260 265 270
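
The residues of SEQ ID NO: 8 correspond to the coding region of SEQ ID NO: 7, and that pairing can be spot-checked by translating codons with the standard genetic code. The following is a minimal Python sketch, not part of the original listing; the helper name `translate` and the compact codon-table construction are illustrative assumptions, and only the first thirteen codons of the coding region of SEQ ID NO: 7 (positions 72 to 110) are used.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter residues."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First thirteen codons of the coding region of SEQ ID NO: 7.
first_codons = "GCC GAG GTC TCG CAA GAT CTG TTT AAC CAG TTC AAT CTC"
print(translate(first_codons))  # AEVSQDLFNQFNL
```

The printed string reads Ala Glu Val Ser Gln Asp Leu Phe Asn Gln Phe Asn Leu in three-letter code, which is how SEQ ID NO: 8 begins.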



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

ATCCCTGCGC ACCTATGGTA CTTCGGGTTA ATTGGGACAT GTCTTGCTAG CCTTA 55 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 59 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 
(B) CLONE: primer 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

AGCTTAAGGC TAGCAAGACA TGTCCCAATT AACCCGAAGT ACCATAGGTG CGCAGGGAT 59 
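
SEQ ID NO: 9 and SEQ ID NO: 10 read as two complementary oligonucleotides. The minimal Python sketch below (an illustration only, not part of the original listing; `reverse_complement` is a hypothetical helper) checks that SEQ ID NO: 10 ends with the reverse complement of SEQ ID NO: 9 and reports the extra 5' bases that would stay single-stranded after annealing.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (whitespace ignored)."""
    seq = "".join(seq.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


SEQ_ID_9 = "ATCCCTGCGC ACCTATGGTA CTTCGGGTTA ATTGGGACAT GTCTTGCTAG CCTTA"
SEQ_ID_10 = "AGCTTAAGGC TAGCAAGACA TGTCCCAATT AACCCGAAGT ACCATAGGTG CGCAGGGAT"

rc9 = reverse_complement(SEQ_ID_9)
seq10 = "".join(SEQ_ID_10.split())

# SEQ ID NO: 10 should consist of a short 5' extension followed by rc9.
paired = seq10.endswith(rc9)
overhang = seq10[: len(seq10) - len(rc9)]
print(paired, overhang)  # True AGCT
```

The four unpaired bases are what one would expect for a HindIII-compatible 5' overhang, although the listing itself does not state this.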



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1828 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Geotrichum candidum 

(B) STRAIN: CMICC 335426

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 40..1731

(D) OTHER INFORMATION: /product= "lipase"

(ix) FEATURE:

(A) NAME/KEY: sig_peptide

(B) LOCATION: 40..96

(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 97..1728

(D) OTHER INFORMATION: /product= "lipase"
/gene= "lipB"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

AATTCGGCAC GAGATTCCTT TGATTTGCAA CTGTTAATC ATG GTT TCC AAA AGC 54
Met Val Ser Lys Ser
-19 -15

TTT TTT TTG GCT GCG GCG CTC AAC GTA GTG GGC ACC TTG GCC CAG GCC 102
Phe Phe Leu Ala Ala Ala Leu Asn Val Val Gly Thr Leu Ala Gln Ala
-10 -5 1

CCC ACG GCC GTT CTT AAT GGC AAC GAG GTC ATC TCT GGT GTC CTT GAG 150
Pro Thr Ala Val Leu Asn Gly Asn Glu Val Ile Ser Gly Val Leu Glu
5 10 15

GGC AAG GTT GAT ACC TTC AAG GGA ATC CCA TTT GCT GAC CCT CCT GTT 198
Gly Lys Val Asp Thr Phe Lys Gly Ile Pro Phe Ala Asp Pro Pro Val
20 25 30

GGT GAC TTG CGG TTC AAG CAC CCC CAG CCT TTC ACT GGA TCC TAC CAG 246
Gly Asp Leu Arg Phe Lys His Pro Gln Pro Phe Thr Gly Ser Tyr Gln
35 40 45 50

GGT CTT AAG GCC AAC GAC TTC AGC TCT GCT TGT ATG CAG CTT GAT CCT 294
Gly Leu Lys Ala Asn Asp Phe Ser Ser Ala Cys Met Gln Leu Asp Pro
55 60 65

GGC AAT GCC TTT TCT TTG CTT GAC AAA GTA GTG GGC TTG GGA AAG ATT 342
Gly Asn Ala Phe Ser Leu Leu Asp Lys Val Val Gly Leu Gly Lys Ile
70 75 80

CTT CCT GAT AAC CTT AGA GGC CCT CTT TAT GAC ATG GCC CAG GGT AGT 390
Leu Pro Asp Asn Leu Arg Gly Pro Leu Tyr Asp Met Ala Gln Gly Ser
85 90 95

GTC TCC ATG AAT GAG GAC TGT CTC TAC CTT AAC GTT TTC CGC CCC GCT 438
Val Ser Met Asn Glu Asp Cys Leu Tyr Leu Asn Val Phe Arg Pro Ala
100 105 110

GGC ACC AAG CCT GAT GCT AAG CTC CCC GTC ATG GTT TGG ATT TAC GGT 486
Gly Thr Lys Pro Asp Ala Lys Leu Pro Val Met Val Trp Ile Tyr Gly
115 120 125 130

GGT GCC TTT GTG TTT GGT TCT TCT GCT TCT TAC CCT GGT AAC GGC TAC 534
Gly Ala Phe Val Phe Gly Ser Ser Ala Ser Tyr Pro Gly Asn Gly Tyr
135 140 145

GTC AAG GAG AGT GTG GAA ATG GGC CAG CCT GTT GTG TTT GTT TCC ATC 582
Val Lys Glu Ser Val Glu Met Gly Gln Pro Val Val Phe Val Ser Ile
150 155 160

AAC TAC CGT ACC GGC CCC TAT GGA TTC TTG GGT GGT GAT GCC ATC ACC 630
Asn Tyr Arg Thr Gly Pro Tyr Gly Phe Leu Gly Gly Asp Ala Ile Thr
165 170 175

GCT GAG GGC AAC ACC AAC GCT GGT CTG CAC GAC CAG CGC AAG GGT CTC 678
Ala Glu Gly Asn Thr Asn Ala Gly Leu His Asp Gln Arg Lys Gly Leu
180 185 190

GAG TGG GTT AGC GAC AAC ATT GCC AAC TTT GGT GGT GAT CCC GAC AAG 726
Glu Trp Val Ser Asp Asn Ile Ala Asn Phe Gly Gly Asp Pro Asp Lys
195 200 205 210

GTC ATG ATT TTC GGT GAG TCC GCT GGT GCC ATG AGT GTT GCT CAC CAG 774
Val Met Ile Phe Gly Glu Ser Ala Gly Ala Met Ser Val Ala His Gln
215 220 225

CTT GTT GCC TAC GGT GGT GAC AAC ACC TAC AAC GGA AAG CAG CTT TTC 822
Leu Val Ala Tyr Gly Gly Asp Asn Thr Tyr Asn Gly Lys Gln Leu Phe
230 235 240

CAC TCT GCC ATT CTT CAG TCT GGC GGT CCT CTT CCT TAC TTT GAC TCT 870
His Ser Ala Ile Leu Gln Ser Gly Gly Pro Leu Pro Tyr Phe Asp Ser
245 250 255

ACT TCT GTT GGT CCC GAG AGT GCC TAC AGC AGA TTT GCT CAG TAT GCC 918
Thr Ser Val Gly Pro Glu Ser Ala Tyr Ser Arg Phe Ala Gln Tyr Ala
260 265 270

GGA TGT GAC ACC AGT GCC AGT GAT AAT GAC ACT CTG GCT TGT CTC CGC 966
Gly Cys Asp Thr Ser Ala Ser Asp Asn Asp Thr Leu Ala Cys Leu Arg
275 280 285 290

AGC AAG TCC AGC GAT GTC TTG CAC AGT GCG CAG AAC TCG TAT GAT CTT 1014
Ser Lys Ser Ser Asp Val Leu His Ser Ala Gln Asn Ser Tyr Asp Leu
295 300 305

AAG GAC CTG TTT GGT CTG CTC CCT CAA TTC CTT GGA TTT GGT CCC AGA 1062
Lys Asp Leu Phe Gly Leu Leu Pro Gln Phe Leu Gly Phe Gly Pro Arg
310 315 320

CCC GAC GGC AAC ATT ATT CCC GAT GCC GCT TAT GAG CTC TAC CGC AGC 1110
Pro Asp Gly Asn Ile Ile Pro Asp Ala Ala Tyr Glu Leu Tyr Arg Ser
325 330 335

GGT AGA TAC GCC AAG GTT CCC TAC ATT ACT GGC AAC CAG GAG GAT GAG 1158
Gly Arg Tyr Ala Lys Val Pro Tyr Ile Thr Gly Asn Gln Glu Asp Glu
340 345 350

GGT ACT ATT CTT GCC CCC GTT GCT ATT AAT GCT ACC ACT ACT CCC CAT 1206
Gly Thr Ile Leu Ala Pro Val Ala Ile Asn Ala Thr Thr Thr Pro His
355 360 365 370

GTT AAG AAG TGG TTG AAG TAC ATT TGT AGC CAG GCT TCT GAC GCT TCG 1254
Val Lys Lys Trp Leu Lys Tyr Ile Cys Ser Gln Ala Ser Asp Ala Ser
375 380 385

CTT GAT CGT GTT TTG TCG CTC TAC CCC GGC TCT TGG TCG GAG GGT TCA 1302
Leu Asp Arg Val Leu Ser Leu Tyr Pro Gly Ser Trp Ser Glu Gly Ser
390 395 400

CCA TTC CGC ACT GGT ATT CTT AAT GCT CTT ACC CCT CAG TTC AAG CGC 1350
Pro Phe Arg Thr Gly Ile Leu Asn Ala Leu Thr Pro Gln Phe Lys Arg
405 410 415

ATT GCT GCC ATT TTC ACT GAT TTG CTG TTC CAG TCT CCT CGT CGT GTT 1398
Ile Ala Ala Ile Phe Thr Asp Leu Leu Phe Gln Ser Pro Arg Arg Val
420 425 430

ATG CTT AAC GCT ACC AAG GAC GTC AAC CGC TGG ACT TAC CTT GCC ACC 1446
Met Leu Asn Ala Thr Lys Asp Val Asn Arg Trp Thr Tyr Leu Ala Thr
435 440 445 450

CAG CTC CAT AAC CTC GTT CCA TTT TTG GGT ACT TTC CAT GGC AGT GAT 1494
Gln Leu His Asn Leu Val Pro Phe Leu Gly Thr Phe His Gly Ser Asp
455 460 465

CTT CTT TTT CAA TAC TAC GTG GAC CTT GGC CCA TCT TCT GCT TAC CGC 1542
Leu Leu Phe Gln Tyr Tyr Val Asp Leu Gly Pro Ser Ser Ala Tyr Arg
470 475 480

CGC TAC TTT ATC TCG TTT GCC AAC CAC CAC GAC CCC AAC GTT GGT ACC 1590
Arg Tyr Phe Ile Ser Phe Ala Asn His His Asp Pro Asn Val Gly Thr
485 490 495

AAC CTC CAA CAG TGG GAT ATG TAC ACT GAT GCA GGC AAG GAG ATG CTT 1638
Asn Leu Gln Gln Trp Asp Met Tyr Thr Asp Ala Gly Lys Glu Met Leu
500 505 510

CAG ATT CAT ATG ATT GGT AAC TCT ATG AGA ACT GAC GAC TTT AGA ATC 1686
Gln Ile His Met Ile Gly Asn Ser Met Arg Thr Asp Asp Phe Arg Ile
515 520 525 530

GAG GGA ATC TCG AAC TTT GAG TCT GAC GTT ACT CTC TTC GGT TAATCCCATT 1738
Glu Gly Ile Ser Asn Phe Glu Ser Asp Val Thr Leu Phe Gly
535 540 545

TAGCAAGTTT TGTGTATTTC AAGTATACCA GTTGATGTAA TATATCAATA GATTACAAAT 1798

TAATTAGTGA AAAAAAAAAA AAAAAAAAAC 1828



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 563 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Val Ser Lys Ser Phe Phe Leu Ala Ala Ala Leu Asn Val Val Gly
-19 -15 -10 -5

Thr Leu Ala Gln Ala Pro Thr Ala Val Leu Asn Gly Asn Glu Val Ile
1 5 10

Ser Gly Val Leu Glu Gly Lys Val Asp Thr Phe Lys Gly Ile Pro Phe
15 20 25

Ala Asp Pro Pro Val Gly Asp Leu Arg Phe Lys His Pro Gln Pro Phe
30 35 40 45

Thr Gly Ser Tyr Gln Gly Leu Lys Ala Asn Asp Phe Ser Ser Ala Cys
50 55 60

Met Gln Leu Asp Pro Gly Asn Ala Phe Ser Leu Leu Asp Lys Val Val
65 70 75

Gly Leu Gly Lys Ile Leu Pro Asp Asn Leu Arg Gly Pro Leu Tyr Asp
80 85 90

Met Ala Gln Gly Ser Val Ser Met Asn Glu Asp Cys Leu Tyr Leu Asn
95 100 105

Val Phe Arg Pro Ala Gly Thr Lys Pro Asp Ala Lys Leu Pro Val Met
110 115 120 125

Val Trp Ile Tyr Gly Gly Ala Phe Val Phe Gly Ser Ser Ala Ser Tyr
130 135 140

Pro Gly Asn Gly Tyr Val Lys Glu Ser Val Glu Met Gly Gln Pro Val
145 150 155

Val Phe Val Ser Ile Asn Tyr Arg Thr Gly Pro Tyr Gly Phe Leu Gly
160 165 170

Gly Asp Ala Ile Thr Ala Glu Gly Asn Thr Asn Ala Gly Leu His Asp
175 180 185

Gln Arg Lys Gly Leu Glu Trp Val Ser Asp Asn Ile Ala Asn Phe Gly
190 195 200 205

Gly Asp Pro Asp Lys Val Met Ile Phe Gly Glu Ser Ala Gly Ala Met
210 215 220

Ser Val Ala His Gln Leu Val Ala Tyr Gly Gly Asp Asn Thr Tyr Asn
225 230 235

Gly Lys Gln Leu Phe His Ser Ala Ile Leu Gln Ser Gly Gly Pro Leu
240 245 250

Pro Tyr Phe Asp Ser Thr Ser Val Gly Pro Glu Ser Ala Tyr Ser Arg
255 260 265

Phe Ala Gln Tyr Ala Gly Cys Asp Thr Ser Ala Ser Asp Asn Asp Thr
270 275 280 285

Leu Ala Cys Leu Arg Ser Lys Ser Ser Asp Val Leu His Ser Ala Gln
290 295 300

Asn Ser Tyr Asp Leu Lys Asp Leu Phe Gly Leu Leu Pro Gln Phe Leu
305 310 315

Gly Phe Gly Pro Arg Pro Asp Gly Asn Ile Ile Pro Asp Ala Ala Tyr
320 325 330

Glu Leu Tyr Arg Ser Gly Arg Tyr Ala Lys Val Pro Tyr Ile Thr Gly
335 340 345

Asn Gln Glu Asp Glu Gly Thr Ile Leu Ala Pro Val Ala Ile Asn Ala
350 355 360 365

Thr Thr Thr Pro His Val Lys Lys Trp Leu Lys Tyr Ile Cys Ser Gln
370 375 380

Ala Ser Asp Ala Ser Leu Asp Arg Val Leu Ser Leu Tyr Pro Gly Ser
385 390 395

Trp Ser Glu Gly Ser Pro Phe Arg Thr Gly Ile Leu Asn Ala Leu Thr
400 405 410

Pro Gln Phe Lys Arg Ile Ala Ala Ile Phe Thr Asp Leu Leu Phe Gln
415 420 425

Ser Pro Arg Arg Val Met Leu Asn Ala Thr Lys Asp Val Asn Arg Trp
430 435 440 445

Thr Tyr Leu Ala Thr Gln Leu His Asn Leu Val Pro Phe Leu Gly Thr
450 455 460

Phe His Gly Ser Asp Leu Leu Phe Gln Tyr Tyr Val Asp Leu Gly Pro
465 470 475

Ser Ser Ala Tyr Arg Arg Tyr Phe Ile Ser Phe Ala Asn His His Asp
480 485 490

Pro Asn Val Gly Thr Asn Leu Gln Gln Trp Asp Met Tyr Thr Asp Ala
495 500 505

Gly Lys Glu Met Leu Gln Ile His Met Ile Gly Asn Ser Met Arg Thr
510 515 520 525

Asp Asp Phe Arg Ile Glu Gly Ile Ser Asn Phe Glu Ser Asp Val Thr
530 535 540

Leu Phe Gly



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GGGGCGGCCG CGCAGGCCCC AAGGCGGTCT CTCAAT 36 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: Part non-coding strand lipaseII

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

ATTGAGAGAC CGCCGTGGGG CCTGGGCCAC 30

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: Part coding strand lipaseII

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

CARACTTTGA GACTGACGTT AATCTCTACG GTTAAAAC 38

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CCCCGCTAGC ACCGTAGAGA TTAACGTCAG TC 32 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CCCGCGGCCG CGAGCATTGA TGGTGGTATC 30

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE: 

(B) CLONE: part non-coding strand lipase 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GATACCACGA TCAATGCT 18 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: Part coding strand lipase

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

AACACAGGCC TCTGTACT 18



(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer lipo6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

CCGCGCTAGC AGTACAGAGG CCTGTGTT 28 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2685 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Saccharomyces cerevisiae

(vii) IMMEDIATE SOURCE: 

(B) CLONE: pYY105 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..2685 

(D) OTHER INFORMATION: /product= "Flocculation protein" /gene= "FLO1"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:



ATG ACA ATG CCT CAT CGC TAT ATG TTT TTG GCA GTC TTT ACA CTT CTG 48
Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu
1 5 10 15

GCA CTA ACT AGT GTG GCC TCA GGA GCC ACA GAG GCG TGC TTA CCA GCA 96
Ala Leu Thr Ser Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala
20 25 30

GGC CAG AGG AAA AGT GGG ATG AAT ATA AAT TTT TAC CAG TAT TCA TTG 144
Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45

AAA GAT TCC TCC ACA TAT TCG AAT GCA GCA TAT ATG GCT TAT GGA TAT 192
Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr
50 55 60

GCC TCA AAA ACC AAA CTA GGT TCT GTC GGA GGA CAA ACT GAT ATC TCG 240
Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80

ATT GAT TAT AAT ATT CCC TGT GTT AGT TCA TCA GGC ACA TTT CCT TGT 288
Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95

CCT CAA GAA GAT TCC TAT GGA AAC TGG GGA TGC AAA GGA ATG GGT GCT 336
Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110

TGT TCT AAT AGT CAA GGA ATT GCA TAC TGG AGT ACT GAT TTA TTT GGT 384
Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125

TTC TAT ACT ACC CCA ACA AAC GTA ACC CTA GAA ATG ACA GGT TAT TTT 432
Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe
130 135 140

TTA CCA CCA CAG ACG GGT TCT TAC ACA TTC AAG TTT GCT ACA GTT GAC 480
Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Lys Phe Ala Thr Val Asp
145 150 155 160

GAC TCT GCA ATT CTA TCA GTA GGT GGT GCA ACC GCG TTC AAC TGT TGT 528
Asp Ser Ala Ile Leu Ser Val Gly Gly Ala Thr Ala Phe Asn Cys Cys
165 170 175

GCT CAA CAG CAA CCG CCG ATC ACA TCA ACG AAC TTT ACC ATT GAC GGT 576
Ala Gln Gln Gln Pro Pro Ile Thr Ser Thr Asn Phe Thr Ile Asp Gly
180 185 190

ATC AAG CCA TGG GGT GGA AGT TTG CCA CCT AAT ATC GAA GGA ACC GTC 624
Ile Lys Pro Trp Gly Gly Ser Leu Pro Pro Asn Ile Glu Gly Thr Val
195 200 205

TAT ATG TAC GCT GGC TAC TAT TAT CCA ATG AAG GTT GTT TAC TCG AAC 672
Tyr Met Tyr Ala Gly Tyr Tyr Tyr Pro Met Lys Val Val Tyr Ser Asn
210 215 220

GCT GTT TCT TGG GGT ACA CTT CCA ATT AGT GTG ACA CTT CCA GAT GGT 720
Ala Val Ser Trp Gly Thr Leu Pro Ile Ser Val Thr Leu Pro Asp Gly
225 230 235 240

ACC ACT GTA AGT GAT GAC TTC GAA GGG TAC GTC TAT TCC TTT GAC GAT 768
Thr Thr Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Ser Phe Asp Asp
245 250 255

GAC CTA AGT CAA TCT AAC TGT ACT GTC CCT GAC CCT TCA AAT TAT GCT 816
Asp Leu Ser Gln Ser Asn Cys Thr Val Pro Asp Pro Ser Asn Tyr Ala
260 265 270

GTC AGT ACC ACT ACA ACT ACA ACG GAA CCA TGG ACC GGT ACT TTC ACT 864
Val Ser Thr Thr Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr
275 280 285

TCT ACA TCT ACT GAA ATG ACC ACC GTC ACC GGT ACC AAC GGC GTT CCA 912
Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Val Pro
290 295 300

ACT GAC GAA ACC GTC ATT GTC ATC AGA ACT CCA ACC AGT GAA GGT CTA 960
Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu
305 310 315 320

ATC AGC ACC ACC ACT GAA CCA TGG ACT GGC ACT TTC ACT TCG ACT TCC 1008
Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
325 330 335

ACT GAG GTT ACC ACC ATC ACT GGA ACC AAC GGT CAA CCA ACT GAC GAA 1056
Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu
340 345 350

ACT GTG ATT GTT ATC AGA ACT CCA ACC AGT GAA GGT CTA ATC AGC ACC 1104
Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Ser Thr
355 360 365

ACC ACT GAA CCA TGG ACT GGT ACT TTC ACT TCT ACA TCT ACT GAA ATG 1152
Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met
370 375 380

ACC ACC GTC ACC GGT ACT AAC GGT CAA CCA ACT GAC GAA ACC GTG ATT 1200
Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
385 390 395 400

GTT ATC AGA ACT CCA ACC AGT GAA GGT TTG GTT ACA ACC ACC ACT GAA 1248
Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Val Thr Thr Thr Thr Glu
405 410 415

CCA TGG ACT GGT ACT TTT ACT TCG ACT TCC ACT GAA ATG TCT ACT GTC 1296
Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Ser Thr Val
420 425 430

ACT GGA ACC AAT GGC TTG CCA ACT GAT GAA ACT GTC ATT GTT GTC AAA 1344
Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu Thr Val Ile Val Val Lys
435 440 445

ACT CCA ACT ACT GCC ATC TCA TCC ACT TTG TCA TCA TCA TCT TCA GGA 1392
Thr Pro Thr Thr Ala Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly
450 455 460

CAA ATC ACC AGC TCT ATC ACG TCT TCG CGT CCA ATT ATT ACC CCA TTC 1440
Gln Ile Thr Ser Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe
465 470 475 480

TAT CCT AGC AAT GGA ACT TCT GTG ATT TCT TCC TCA GTA ATT TCT TCC 1488
Tyr Pro Ser Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser
485 490 495

TCA GTC ACT TCT TCT CTA TTC ACT TCT TCT CCA GTC ATT TCT TCC TCA 1536
Ser Val Thr Ser Ser Leu Phe Thr Ser Ser Pro Val Ile Ser Ser Ser
500 505 510

GTC ATT TCT TCT TCT ACA ACA ACC TCC ACT TCT ATA TTT TCT GAA TCA 1584
Val Ile Ser Ser Ser Thr Thr Thr Ser Thr Ser Ile Phe Ser Glu Ser
515 520 525

TCT AAA TCA TCC GTC ATT CCA ACC AGT ACT TCC ACC TCT GGT TCT TCT 1632
Ser Lys Ser Ser Val Ile Pro Thr Ser Ser Ser Thr Ser Gly Ser Ser
530 535 540

GAG AGC GAA ACG AGT TCA GCT GGT TCT GTC TCT TCT TCC TCT TTT ATC 1680
Glu Ser Glu Thr Ser Ser Ala Gly Ser Val Ser Ser Ser Ser Phe Ile
545 550 555 560

TCT TCT GAA TCA TCA AAA TCT CCT ACA TAT TCT TCT TCA TCA TTA CCA 1728
Ser Ser Glu Ser Ser Lys Ser Pro Thr Tyr Ser Ser Ser Ser Leu Pro
565 570 575

CTT GTT ACC AGT GCG ACA ACA AGC CAG GAA ACT GCT TCT TCA TTA CCA 1776
Leu Val Thr Ser Ala Thr Thr Ser Gln Glu Thr Ala Ser Ser Leu Pro
580 585 590

CCT GCT ACC ACT ACA AAA ACG AGC GAA CAA ACC ACT TTG GTT ACC GTG 1824
Pro Ala Thr Thr Thr Lys Thr Ser Glu Gln Thr Thr Leu Val Thr Val
595 600 605

ACA TCC TGC GAG TCT CAT GTG TGC ACT GAA TCC ATC TCC CCT GCG ATT 1872
Thr Ser Cys Glu Ser His Val Cys Thr Glu Ser Ile Ser Pro Ala Ile
610 615 620

GTT TCC ACA GCT ACT GTT ACT GTT AGC GGC GTC ACA ACA GAG TAT ACC 1920
Val Ser Thr Ala Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr
625 630 635 640

ACA TGG TGC CCT ATT TCT ACT ACA GAG ACA ACA AAG CAA ACC AAA GGG 1968
Thr Trp Cys Pro Ile Ser Thr Thr Glu Thr Thr Lys Gln Thr Lys Gly
645 650 655

ACA ACA GAG CAA ACC ACA GAA ACA ACA AAA CAA ACC ACG GTA GTT ACA 2016
Thr Thr Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr
660 665 670

ATT TCT TCT TGT GAA TCT GAC GTA TGC TCT AAG ACT GCT TCT CCA GCC 2064
Ile Ser Ser Cys Glu Ser Asp Val Cys Ser Lys Thr Ala Ser Pro Ala
675 680 685

ATT GTA TCT ACA AGC ACT GCT ACT ATT AAC GGC GTT ACT ACA GAA TAC 2112
Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr Glu Tyr
690 695 700

ACA ACA TGG TGT CCT ATT TCC ACC ACA GAA TCG AGG CAA CAA ACA ACG 2160
Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Arg Gln Gln Thr Thr
705 710 715 720

CTA GTT ACT GTT ACT TCC TGC GAA TCT GGT GTG TGT TCC GAA ACT GCT 2208
Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys Ser Glu Thr Ala
725 730 735

TCA CCT GCC ATT GTT TCG ACG GCC ACG GCT ACT GTG AAT GAT GTT GTT 2256
Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr Val Asn Asp Val Val
740 745 750

ACG GTC TAT CCT ACA TGG AGG CCA CAG ACT GCG AAT GAA GAG TCT GTC 2304
Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr Ala Asn Glu Glu Ser Val
755 760 765

AGC TCT AAA ATG AAC AGT GCT ACC GGT GAG ACA ACA ACC AAT ACT TTA 2352
Ser Ser Lys Met Asn Ser Ala Thr Gly Glu Thr Thr Thr Asn Thr Leu
770 775 780

GCT GCT GAA ACG ACT ACC AAT ACT GTA GCT GCT GAG ACG ATT ACC AAT 2400
Ala Ala Glu Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Ile Thr Asn
785 790 795 800

ACT GGA GCT GCT GAG ACG AAA ACA GTA GTC ACC TCT TCG CTT TCA AGA 2448
Thr Gly Ala Ala Glu Thr Lys Thr Val Val Thr Ser Ser Leu Ser Arg
805 810 815

TCT AAT CAC GCT GAA ACA CAG ACG GCT TCC GCG ACC GAT GTG ATT GGT 2496
Ser Asn His Ala Glu Thr Gln Thr Ala Ser Ala Thr Asp Val Ile Gly
820 825 830

CAC AGC AGT AGT GTT GTT TCT GTA TCC GAA ACT GGC AAC ACC AAG AGT 2544
His Ser Ser Ser Val Val Ser Val Ser Glu Thr Gly Asn Thr Lys Ser
835 840 845

CTA ACA AGT TCC GGG TTG AGT ACT ATG TCG CAA CAG CCT CGT AGC ACA 2592
Leu Thr Ser Ser Gly Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr
850 855 860

CCA GCA AGC AGC ATG GTA GGA TAT AGT ACA GCT TCT TTA GAA ATT TCA 2640
Pro Ala Ser Ser Met Val Gly Tyr Ser Thr Ala Ser Leu Glu Ile Ser
865 870 875 880

ACG TAT GCT GGC AGT GCA ACA GCT TAC TGG CCG GTA GTG GTT TAA 2685
Thr Tyr Ala Gly Ser Ala Thr Ala Tyr Trp Pro Val Val Val
885 890



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 894 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

Met Thr Met Pro His Arg Tyr Met Phe Leu Ala Val Phe Thr Leu Leu
1 5 10 15

Ala Leu Thr Ser Val Ala Ser Gly Ala Thr Glu Ala Cys Leu Pro Ala
20 25 30

Gly Gln Arg Lys Ser Gly Met Asn Ile Asn Phe Tyr Gln Tyr Ser Leu
35 40 45

Lys Asp Ser Ser Thr Tyr Ser Asn Ala Ala Tyr Met Ala Tyr Gly Tyr
50 55 60

Ala Ser Lys Thr Lys Leu Gly Ser Val Gly Gly Gln Thr Asp Ile Ser
65 70 75 80

Ile Asp Tyr Asn Ile Pro Cys Val Ser Ser Ser Gly Thr Phe Pro Cys
85 90 95

Pro Gln Glu Asp Ser Tyr Gly Asn Trp Gly Cys Lys Gly Met Gly Ala
100 105 110

Cys Ser Asn Ser Gln Gly Ile Ala Tyr Trp Ser Thr Asp Leu Phe Gly
115 120 125

Phe Tyr Thr Thr Pro Thr Asn Val Thr Leu Glu Met Thr Gly Tyr Phe
130 135 140

Leu Pro Pro Gln Thr Gly Ser Tyr Thr Phe Lys Phe Ala Thr Val Asp
145 150 155 160

Asp Ser Ala Ile Leu Ser Val Gly Gly Ala Thr Ala Phe Asn Cys Cys
165 170 175

Ala Gin Gin Gin Pro Pro lie Thr Ser Thr Ann Phe Thr lie Asp Gly 
180 165 190 

He Lys Pro Trp Gly Gly Ser Leu Pro Pro Asn He Glu Gly Thr Val 
195 200 205 

Tyr Met Tyr Ala Gly Tyr Tyr Tyr Pro Met Lyo Val Val Tyr Ser Asn 
210 215 220 

Ala Val Ser Trp Gly Thr Leu Pro He Ser Val Thr Leu Pro Asp Gly 
225 230 235 240 

Thr Thr Val Ser Asp Asp Phe Glu Gly Tyr Val Tyr Ser Phe Asp Asp 
245 250 255 

Asp Leu Ser Gin Ser Asn Cys Thr Val Pro Asp Pro Ser Asn Tyr Ala 
260 265 270 

Val Ser Thr Thr Thr Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr 
275 280 285 

Ser Thr Ser Thr Glu Met Thr Thr Val Thr Gly Thr Asn Gly Val Pro 
290 295 300 

Thr Asp Glu Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu
305 310 315 320

Ile Ser Thr Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser
325 330 335

Thr Glu Val Thr Thr Ile Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu
340 345 350

Thr Val Ile Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Ile Ser Thr
355 360 365

Thr Thr Glu Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met
370 375 380

Thr Thr Val Thr Gly Thr Asn Gly Gln Pro Thr Asp Glu Thr Val Ile
385 390 395 400

Val Ile Arg Thr Pro Thr Ser Glu Gly Leu Val Thr Thr Thr Thr Glu
405 410 415 

Pro Trp Thr Gly Thr Phe Thr Ser Thr Ser Thr Glu Met Ser Thr Val 
420 425 430 

Thr Gly Thr Asn Gly Leu Pro Thr Asp Glu Thr Val Ile Val Val Lys
435 440 445

Thr Pro Thr Thr Ala Ile Ser Ser Ser Leu Ser Ser Ser Ser Ser Gly
450 455 460

Gln Ile Thr Ser Ser Ile Thr Ser Ser Arg Pro Ile Ile Thr Pro Phe
465 470 475 480

Tyr Pro Ser Asn Gly Thr Ser Val Ile Ser Ser Ser Val Ile Ser Ser
485 490 495

Ser Val Thr Ser Ser Leu Phe Thr Ser Ser Pro Val Ile Ser Ser Ser
500 505 510

Val Ile Ser Ser Ser Thr Thr Thr Ser Thr Ser Ile Phe Ser Glu Ser
515 520 525

Ser Lys Ser Ser Val Ile Pro Thr Ser Ser Ser Thr Ser Gly Ser Ser
530 535 540

Glu Ser Glu Thr Ser Ser Ala Gly Ser Val Ser Ser Ser Ser Phe Ile
545 550 555 560

Ser Ser Glu Ser Ser Lys Ser Pro Thr Tyr Ser Ser Ser Ser Leu Pro
565 570 575

Leu Val Thr Ser Ala Thr Thr Ser Gln Glu Thr Ala Ser Ser Leu Pro
580 585 590






Pro Ala Thr Thr Thr Lys Thr Ser Glu Gln Thr Thr Leu Val Thr Val
595 600 605

Thr Ser Cys Glu Ser His Val Cys Thr Glu Ser Ile Ser Pro Ala Ile
610 615 620

Val Ser Thr Ala Thr Val Thr Val Ser Gly Val Thr Thr Glu Tyr Thr 
625 630 635 640 

Thr Trp Cys Pro Ile Ser Thr Thr Glu Thr Thr Lys Gln Thr Lys Gly
645 650 655

Thr Thr Glu Gln Thr Thr Glu Thr Thr Lys Gln Thr Thr Val Val Thr
660 665 670

Ile Ser Ser Cys Glu Ser Asp Val Cys Ser Lys Thr Ala Ser Pro Ala
675 680 685

Ile Val Ser Thr Ser Thr Ala Thr Ile Asn Gly Val Thr Thr Glu Tyr
690 695 700

Thr Thr Trp Cys Pro Ile Ser Thr Thr Glu Ser Arg Gln Gln Thr Thr
705 710 715 720 

Leu Val Thr Val Thr Ser Cys Glu Ser Gly Val Cys Ser Glu Thr Ala 
725 730 735 

Ser Pro Ala Ile Val Ser Thr Ala Thr Ala Thr Val Asn Asp Val Val
740 745 750

Thr Val Tyr Pro Thr Trp Arg Pro Gln Thr Ala Asn Glu Glu Ser Val
755 760 765

Ser Ser Lys Met Asn Ser Ala Thr Gly Glu Thr Thr Thr Asn Thr Leu
770 775 780

Ala Ala Glu Thr Thr Thr Asn Thr Val Ala Ala Glu Thr Ile Thr Asn
785 790 795 800

Thr Gly Ala Ala Glu Thr Lys Thr Val Val Thr Ser Ser Leu Ser Arg
805 810 815

Ser Asn His Ala Glu Thr Gln Thr Ala Ser Ala Thr Asp Val Ile Gly
820 825 830

His Ser Ser Ser Val Val Ser Val Ser Glu Thr Gly Asn Thr Lys Ser
835 840 845

Leu Thr Ser Ser Gly Leu Ser Thr Met Ser Gln Gln Pro Arg Ser Thr
850 855 860

Pro Ala Ser Ser Met Val Gly Tyr Ser Thr Ala Ser Leu Glu Ile Ser
865 870 875 880

Thr Tyr Ala Gly Ser Ala Thr Ala Tyr Trp Pro Val Val Val 
885 890 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: primer pcrflo1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

GAATTCGCTA GCAATTATGC TGTCAGTACC 30



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: Part non-coding sequence FLO1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

AGTGGTACTG ACAGCATAAT TTGA 24 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vii) IMMEDIATE SOURCE:

(B) CLONE: Part coding sequence FLO1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

AATAAAATTC GCGTTCTTTT TACG 24 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: primer pcrflo2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GAGCTCAAGC TTCGTAAAAA GAACGCGAAT T 31 
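
The four oligonucleotide entries above appear to pair up: the 3' portion of each primer (SEQ ID NO: 23 and 26) is the reverse complement of part of a FLO1 fragment (SEQ ID NO: 24 and 25, respectively). The following is a minimal Python sketch of that consistency check, not part of the listing itself. The function and variable names are illustrative assumptions. Reading the 5' primer tails as restriction sites (EcoRI GAATTC, NheI GCTAGC, SacI GAGCTC, HindIII AAGCTT) is an inference from the well-known recognition sequences and is not stated in this listing.

```python
# Sequences copied from SEQ ID NO: 23-26 above, with spaces removed.
SEQ23_PRIMER_PCRFLO1 = "GAATTCGCTAGCAATTATGCTGTCAGTACC"
SEQ24_FLO1_NONCODING = "AGTGGTACTGACAGCATAATTTGA"
SEQ25_FLO1_CODING = "AATAAAATTCGCGTTCTTTTTACG"
SEQ26_PRIMER_PCRFLO2 = "GAGCTCAAGCTTCGTAAAAAGAACGCGAATT"

# Translation table for Watson-Crick complements of unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_3prime_match(primer: str, template: str) -> int:
    """Length of the longest 3'-terminal stretch of `primer` that occurs
    in the reverse complement of `template` (0 if none)."""
    target = reverse_complement(template)
    for n in range(len(primer), 0, -1):
        if primer[-n:] in target:
            return n
    return 0


if __name__ == "__main__":
    pairs = [
        ("pcrflo1", SEQ23_PRIMER_PCRFLO1, SEQ24_FLO1_NONCODING),
        ("pcrflo2", SEQ26_PRIMER_PCRFLO2, SEQ25_FLO1_CODING),
    ]
    for name, primer, template in pairs:
        n = longest_3prime_match(primer, template)
        # The non-annealing 5' tail is whatever precedes the matched stretch.
        print(name, "annealing length:", n, "5' tail:", primer[: len(primer) - n])
```

Under these assumptions the unmatched 5' tail of each primer is 12 nucleotides long, which is consistent with two six-base recognition sites added for cloning.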



1. A method for immobilizing an enzyme, comprising
recombinantly producing an enzyme or a functional fragment
thereof linked to the exterior of a host cell, said method
comprising localizing the enzyme or functional fragment
thereof at the exterior of the cell wall of a fungus by linking
the enzyme or the functional part thereof to the anchoring
part of a cell wall anchoring protein, which anchoring part
is derivable from the C-terminal part of said anchoring
protein.

2. A method according to claim 1 in which said fungus is
[illegible].

3. The method of claim 1, in which said fungus is selected
from the group consisting of yeasts belonging to the genera
Candida, Debaryomyces, Hansenula, Kluyveromyces,
Pichia and Saccharomyces, and molds belonging to the
genera Aspergillus, Penicillium and Rhizopus.

4. A fungus containing an expressible first polynucleotide
comprising a structural gene encoding a protein providing
[illegible].

5. The fungus of claim 4, further comprising a sequence
encoding a signal peptide, said sequence being operably
linked to said first polynucleotide such that the translation
product of said first polynucleotide is secreted to the cell
wall of said fungus.

6. The fungus of claim 5, wherein the signal peptide is
derived from a protein selected from the group consisting of
glycosyl-phosphatidyl-inositol (GPI) anchoring protein,
α-factor, α-agglutinin, a-agglutinin, [illegible]
yeasts, α-amylase of Bacillus, and proteinases of lactic acid
bacteria.

7. [illegible] of
anchoring in the cell wall of said fungus is selected from the
group consisting of α-agglutinin, [illegible]
protein, and Major Cell Wall Protein of a [illegible].

8. The fungus of claim 4, wherein the protein providing
catalytic activity is selected from the group consisting of a
hydrolytic enzyme and an oxido-reductase.

9. The fungus of claim 8 wherein said hydrolytic enzyme
[illegible].

10. The fungus of claim 8 wherein said protein providing
[illegible].

11. [illegible]
second polynucleotide comprising a structural gene encoding
said protein providing catalytic activity operably linked
to a sequence encoding a signal peptide ensuring secretion
of the expression product of said second polynucleotide
which is operably linked to a regulatable promoter.

12. The fungus of claim 11, wherein said second poly- 
nucleotide is present either in a separate vector than the first 
polynucleotide or is present in the chromosome of said 
fungus. 

13. The fungus of claim 4, having at least one of said
polynucleotides integrated in its chromosome.

14. The fungus of claim 4, having said protein providing
catalytic activity immobilized at the exterior of its cell wall.

15. The fungus of claim 4, which is a yeast. 

16. A process for carrying out an enzymatic process by
using an immobilized catalytically active protein, wherein a
substrate for said catalytically active protein is contacted
with the fungus of claim 4.

17. A process according to claim 16 in which the fungus 
is a yeast. 
